---
title: Multimodal AI Backend Service
emoji: 🚀
colorFrom: yellow
colorTo: purple
sdk: docker
app_port: 8000
pinned: false
---

# firstAI - Multimodal AI Backend 🚀

An AI backend service with **multimodal capabilities**: it supports text generation and image analysis through Transformers pipelines.

## 🎉 Features

### 🤖 Dual AI Models

- **Text Generation**: Microsoft DialoGPT-medium for conversations
- **Image Analysis**: Salesforce BLIP for image captioning and visual Q&A

### 🖼️ Multimodal Support

- Process text-only messages
- Analyze images from URLs
- Combined image + text conversations
- OpenAI Vision API-compatible message format

### 🔧 Production Ready

- FastAPI backend with automatic docs
- Comprehensive error handling
- Health checks and monitoring
- PyTorch with MPS acceleration (Apple Silicon)
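
This README does not show how the service chooses between CUDA, MPS, and CPU. A minimal sketch of one common approach, assuming PyTorch 1.12 or later; `pick_device` is a hypothetical helper, not a function taken from `backend_service.py`:

```python
def pick_device() -> str:
    """Return the best available PyTorch device name, falling back to CPU."""
    try:
        import torch  # imported lazily so the helper still works without PyTorch
    except ImportError:
        return "cpu"

    if torch.cuda.is_available():
        return "cuda"
    # torch.backends.mps was added in PyTorch 1.12; guard for older builds.
    mps = getattr(torch.backends, "mps", None)
    if mps is not None and mps.is_available():
        return "mps"
    return "cpu"


if __name__ == "__main__":
    print(f"Selected device: {pick_device()}")
```

The returned string can be passed as the `device` argument when constructing a Transformers pipeline.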

## 🚀 Quick Start

### 1. Install Dependencies

```bash
pip install -r requirements.txt
```

### 2. Start the Service

```bash
python backend_service.py
```

### 3. Test Multimodal Capabilities

```bash
python test_final.py
```

The service will start on **http://localhost:8001** with both text and vision models loaded.

## 💡 Usage Examples

### Text-Only Chat

```bash
curl -X POST http://localhost:8001/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "microsoft/DialoGPT-medium",
    "messages": [{"role": "user", "content": "Hello!"}]
  }'
```
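
The same request can be sent from Python. The following is a minimal sketch using only the standard library; `build_chat_payload` and `post_chat` are hypothetical helper names, and the base URL assumes the service is listening on the port used in the curl example above.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8001"  # assumption: same host and port as the curl example


def build_chat_payload(prompt: str, model: str = "microsoft/DialoGPT-medium") -> dict:
    """Build the same JSON body that the curl example sends."""
    return {"model": model, "messages": [{"role": "user", "content": prompt}]}


def post_chat(payload: dict, base_url: str = BASE_URL, timeout: float = 60.0) -> dict:
    """POST the payload to /v1/chat/completions and return the decoded JSON reply."""
    request = urllib.request.Request(
        f"{base_url}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

With the service running, `post_chat(build_chat_payload("Hello!"))` returns the parsed JSON response.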

### Image Analysis

```bash
curl -X POST http://localhost:8001/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "Salesforce/blip-image-captioning-base",
    "messages": [
      {
        "role": "user",
        "content": [
          {
            "type": "image",
            "url": "https://example.com/image.jpg"
          }
        ]
      }
    ]
  }'
```

### Multimodal (Image + Text)

```bash
curl -X POST http://localhost:8001/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "Salesforce/blip-image-captioning-base",
    "messages": [
      {
        "role": "user",
        "content": [
          {
            "type": "image",
            "url": "https://example.com/image.jpg"
          },
          {
            "type": "text",
            "text": "What do you see in this image?"
          }
        ]
      }
    ]
  }'
```
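
The image-only and image-plus-text bodies differ only in whether a text part follows the image part. A small sketch that builds either one, mirroring the field names used in the curl examples above; the function names are hypothetical:

```python
from typing import Optional

VISION_MODEL = "Salesforce/blip-image-captioning-base"


def build_multimodal_message(image_url: str, question: Optional[str] = None) -> dict:
    """Build a user message with an image part and an optional text part."""
    content = [{"type": "image", "url": image_url}]
    if question:
        content.append({"type": "text", "text": question})
    return {"role": "user", "content": content}


def build_vision_payload(
    image_url: str, question: Optional[str] = None, model: str = VISION_MODEL
) -> dict:
    """Wrap a single multimodal message in a chat completions request body."""
    return {"model": model, "messages": [build_multimodal_message(image_url, question)]}
```

Calling `build_vision_payload(url)` yields the image-only body; passing a question as the second argument yields the combined body.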

## 🔧 Technical Details

### Architecture

- **FastAPI** web framework
- **Transformers** pipeline for AI models
- **PyTorch** backend with GPU/MPS support
- **Pydantic** for request/response validation

### Models

- **Text**: microsoft/DialoGPT-medium
- **Vision**: Salesforce/blip-image-captioning-base

### API Endpoints

- `GET /` - Service information
- `GET /health` - Health check
- `GET /v1/models` - List available models
- `POST /v1/chat/completions` - Chat completions (text/multimodal)
- `GET /docs` - Interactive API documentation
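
A quick way to confirm that the GET endpoints listed above respond is to probe each one and record its HTTP status. This is a sketch using the standard library; `http_status` and `probe` are hypothetical helpers, and the base URL you pass in is whatever address the service is bound to.

```python
import urllib.error
import urllib.request
from typing import Callable, Dict, Iterable, Optional

GET_ENDPOINTS = ("/", "/health", "/v1/models", "/docs")


def http_status(url: str, timeout: float = 5.0) -> Optional[int]:
    """Return the HTTP status of a GET request, or None if the host is unreachable."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as exc:  # 4xx/5xx responses still carry a status
        return exc.code
    except (urllib.error.URLError, OSError):
        return None


def probe(
    base_url: str,
    paths: Iterable[str] = GET_ENDPOINTS,
    fetch: Callable[[str], Optional[int]] = http_status,
) -> Dict[str, Optional[int]]:
    """Map each path to the status returned by `fetch` for base_url + path."""
    root = base_url.rstrip("/")
    return {path: fetch(root + path) for path in paths}
```

The `fetch` parameter is injectable so the probe can be exercised in tests without a running server.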

## 🧪 Testing

Run the comprehensive test suite:

```bash
python test_final.py
```

Test individual components:

```bash
python test_multimodal.py  # Basic multimodal tests
python test_pipeline.py    # Pipeline compatibility
```

## 📦 Dependencies

Key packages:

- `fastapi` - Web framework
- `transformers` - AI model pipelines
- `torch` - PyTorch backend
- `Pillow` - Image processing
- `accelerate` - Model acceleration
- `requests` - HTTP client

## 🎯 Integration Complete

This project integrates:

✅ **Transformers image-text-to-text pipeline**  
✅ **OpenAI Vision API compatibility**  
✅ **Multimodal message processing**  
✅ **Production-ready FastAPI service**

See `MULTIMODAL_INTEGRATION_COMPLETE.md` for detailed integration documentation.

---

# AI Backend Service 🚀

**Status: ✅ CONVERSION COMPLETE!**

This project was converted from a non-functioning Gradio Hugging Face app into a production-ready FastAPI backend service with OpenAI-compatible API endpoints.

## Quick Start

### 1. Setup Environment

```bash
# Activate the virtual environment
source gradio_env/bin/activate

# Install dependencies (skip if already installed)
pip install -r requirements.txt
```

### 2. Start the Backend Service

```bash
python backend_service.py --port 8000 --reload
```

### 3. Test the API

```bash
# Run comprehensive tests
python test_api.py

# Or try usage examples
python usage_examples.py
```

## API Endpoints

| Endpoint               | Method | Description                         |
| ---------------------- | ------ | ----------------------------------- |
| `/`                    | GET    | Service information                 |
| `/health`              | GET    | Health check                        |
| `/v1/models`           | GET    | List available models               |
| `/v1/chat/completions` | POST   | Chat completion (OpenAI compatible) |
| `/v1/completions`      | POST   | Text completion                     |

## Example Usage

### Chat Completion

```bash
curl -X POST http://localhost:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "microsoft/DialoGPT-medium",
    "messages": [
      {"role": "user", "content": "Hello! How are you?"}
    ],
    "max_tokens": 150,
    "temperature": 0.7
  }'
```
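
If the service follows the OpenAI chat completion response shape, as the compatibility claim suggests, the reply text sits at `choices[0].message.content`. The sketch below assumes that shape; `extract_reply` is a hypothetical helper, and this README does not show an actual response body.

```python
def extract_reply(response: dict) -> str:
    """Return the assistant text from an OpenAI-style chat completion response."""
    choices = response.get("choices") or []
    if not choices:
        raise ValueError("response contains no choices")
    message = choices[0].get("message") or {}
    content = message.get("content")
    if content is None:
        raise ValueError("first choice has no message content")
    return content
```

Raising on a missing field, rather than returning an empty string, makes a schema mismatch visible immediately.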

### Streaming Chat

```bash
curl -X POST http://localhost:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "microsoft/DialoGPT-medium",
    "messages": [
      {"role": "user", "content": "Tell me a joke"}
    ],
    "stream": true
  }'
```
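
With `"stream": true`, OpenAI-compatible servers conventionally send server-sent events: one `data: {...}` line per chunk, terminated by `data: [DONE]`. Whether this service uses exactly that framing is an assumption, since the README does not show the stream format. Under that assumption, the text deltas can be collected as follows; `iter_stream_text` is a hypothetical helper.

```python
import json
from typing import Iterable, Iterator


def iter_stream_text(lines: Iterable[str]) -> Iterator[str]:
    """Yield text deltas from OpenAI-style SSE lines of the form 'data: {...}'."""
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank separators and non-data SSE fields
        data = line[len("data:"):].strip()
        if data == "[DONE]":
            break  # end-of-stream sentinel
        chunk = json.loads(data)
        for choice in chunk.get("choices", []):
            text = choice.get("delta", {}).get("content")
            if text:
                yield text
```

The function accepts any iterable of decoded lines, so it works with a streamed HTTP response body as well as with a list of strings in a test.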

## Files

- **`app.py`** - Original Gradio ChatInterface (still functional)
- **`backend_service.py`** - New FastAPI backend service ⭐
- **`test_api.py`** - Comprehensive API testing
- **`usage_examples.py`** - Simple usage examples
- **`requirements.txt`** - Updated dependencies
- **`CONVERSION_COMPLETE.md`** - Detailed conversion documentation

## Features

✅ **OpenAI-Compatible API** - Drop-in replacement for OpenAI API  
✅ **Async FastAPI** - High-performance async architecture  
✅ **Streaming Support** - Real-time response streaming  
✅ **Error Handling** - Robust error handling with fallbacks  
✅ **Production Ready** - CORS, logging, health checks  
✅ **Docker Ready** - Easy containerization  
✅ **Auto-reload** - Development-friendly auto-reload  
✅ **Type Safety** - Full type hints with Pydantic validation

## Service URLs

- **Backend Service**: http://localhost:8000
- **API Documentation**: http://localhost:8000/docs
- **OpenAPI Spec**: http://localhost:8000/openapi.json

## Model Information

- **Current Model**: `microsoft/DialoGPT-medium`
- **Type**: Conversational AI model
- **Provider**: HuggingFace Inference API
- **Capabilities**: Text generation, chat completion

## Architecture

```
┌─────────────────────┐    ┌──────────────────────┐    ┌─────────────────────┐
│   Client Request    │───▶│   FastAPI Backend    │───▶│  HuggingFace API    │
│  (OpenAI format)    │    │  (backend_service)   │    │  (DialoGPT-medium)  │
└─────────────────────┘    └──────────────────────┘    └─────────────────────┘
                                       │
                                       ▼
                           ┌──────────────────────┐
                           │   OpenAI Response    │
                           │   (JSON/Streaming)   │
                           └──────────────────────┘
```

## Development

The service includes:

- **Auto-reload** for development
- **Comprehensive logging** for debugging
- **Type checking** for code quality
- **Test suite** for reliability
- **Error handling** for robustness

## Production Deployment

Ready for production with:

- **Environment variables** for configuration
- **Health check endpoints** for monitoring
- **CORS support** for web applications
- **Docker compatibility** for containerization
- **Structured logging** for observability
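
The README does not name the environment variables the service reads. As an illustration of the pattern only, the sketch below loads settings from hypothetical variables (`HOST`, `PORT`, `MODEL_ID`, `LOG_LEVEL`). The default port and model come from this README; the default host and log level are assumptions.

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class Settings:
    host: str = "0.0.0.0"  # assumption: bind on all interfaces inside a container
    port: int = 8000
    model_id: str = "microsoft/DialoGPT-medium"
    log_level: str = "INFO"  # assumption


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Build Settings from environment variables, falling back to the defaults."""
    defaults = Settings()
    return Settings(
        host=env.get("HOST", defaults.host),
        port=int(env.get("PORT", defaults.port)),
        model_id=env.get("MODEL_ID", defaults.model_id),
        log_level=env.get("LOG_LEVEL", defaults.log_level).upper(),
    )
```

Passing the environment as a mapping keeps the loader easy to test: `load_settings({"PORT": "9000"})` overrides only the port.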

---

**🎉 Conversion Status: COMPLETE!**  
The project was transformed from a broken Gradio app into a production-ready AI backend service.

For detailed conversion documentation, see [`CONVERSION_COMPLETE.md`](CONVERSION_COMPLETE.md).