---
title: Multimodal AI Backend Service
emoji: π
colorFrom: yellow
colorTo: purple
sdk: docker
app_port: 8000
pinned: false
---
# firstAI - Multimodal AI Backend
A powerful AI backend service with multimodal capabilities, supporting both text generation and image analysis via Transformers pipelines.
## Features

### Dual AI Models

- **Text Generation:** Microsoft DialoGPT-medium for conversations
- **Image Analysis:** Salesforce BLIP for image captioning and visual Q&A
### Multimodal Support

- Process text-only messages
- Analyze images from URLs
- Combined image + text conversations
- OpenAI Vision API compatible format
### Production Ready

- FastAPI backend with automatic docs
- Comprehensive error handling
- Health checks and monitoring
- PyTorch with MPS acceleration (Apple Silicon)
## Quick Start

### 1. Install Dependencies

```bash
pip install -r requirements.txt
```

### 2. Start the Service

```bash
python backend_service.py
```

### 3. Test Multimodal Capabilities

```bash
python test_final.py
```

The service will start on http://localhost:8001 with both text and vision models loaded.
## Usage Examples

### Text-Only Chat

```bash
curl -X POST http://localhost:8001/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "microsoft/DialoGPT-medium",
    "messages": [{"role": "user", "content": "Hello!"}]
  }'
```
### Image Analysis

```bash
curl -X POST http://localhost:8001/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "Salesforce/blip-image-captioning-base",
    "messages": [
      {
        "role": "user",
        "content": [
          {
            "type": "image",
            "url": "https://example.com/image.jpg"
          }
        ]
      }
    ]
  }'
```
### Multimodal (Image + Text)

```bash
curl -X POST http://localhost:8001/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "Salesforce/blip-image-captioning-base",
    "messages": [
      {
        "role": "user",
        "content": [
          {
            "type": "image",
            "url": "https://example.com/image.jpg"
          },
          {
            "type": "text",
            "text": "What do you see in this image?"
          }
        ]
      }
    ]
  }'
```
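The same multimodal request can be assembled in Python before sending it with any HTTP client. A minimal sketch; the model name and content shape follow the curl example above, and `build_multimodal_payload` is a hypothetical helper, not part of the service:

```python
import json

def build_multimodal_payload(image_url: str, question: str) -> dict:
    """Build a chat-completions request with one image part and one text part."""
    return {
        "model": "Salesforce/blip-image-captioning-base",
        "messages": [
            {
                "role": "user",
                "content": [
                    {"type": "image", "url": image_url},
                    {"type": "text", "text": question},
                ],
            }
        ],
    }

payload = build_multimodal_payload(
    "https://example.com/image.jpg", "What do you see in this image?"
)
print(json.dumps(payload, indent=2))
```

The resulting JSON can then be POSTed to `/v1/chat/completions` with, for example, `requests.post(url, json=payload)`.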
## Technical Details

### Architecture

- FastAPI web framework
- Transformers pipeline for AI models
- PyTorch backend with GPU/MPS support
- Pydantic for request/response validation
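As a rough sketch of the request shapes that the Pydantic validation layer handles, using stdlib dataclasses so the snippet stays dependency-free (field names follow the curl examples in this README, not the actual service code):

```python
from dataclasses import dataclass
from typing import List, Union

@dataclass
class ContentPart:
    type: str       # "text" or "image"
    text: str = ""  # used when type == "text"
    url: str = ""   # used when type == "image"

@dataclass
class Message:
    role: str                               # "user", "assistant", "system"
    content: Union[str, List[ContentPart]]  # plain string or multimodal parts

@dataclass
class ChatRequest:
    model: str
    messages: List[Message]

# A request equivalent to the "Image Analysis" curl example above.
req = ChatRequest(
    model="Salesforce/blip-image-captioning-base",
    messages=[
        Message(
            role="user",
            content=[ContentPart(type="image", url="https://example.com/image.jpg")],
        )
    ],
)
print(req.model)
```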
### Models

- **Text:** `microsoft/DialoGPT-medium`
- **Vision:** `Salesforce/blip-image-captioning-base`
### API Endpoints

- `GET /` - Service information
- `GET /health` - Health check
- `GET /v1/models` - List available models
- `POST /v1/chat/completions` - Chat completions (text/multimodal)
- `GET /docs` - Interactive API documentation
## Testing

Run the comprehensive test suite:

```bash
python test_final.py
```

Test individual components:

```bash
python test_multimodal.py   # Basic multimodal tests
python test_pipeline.py     # Pipeline compatibility
```
## Dependencies

Key packages:

- `fastapi` - Web framework
- `transformers` - AI model pipelines
- `torch` - PyTorch backend
- `Pillow` - Image processing
- `accelerate` - Model acceleration
- `requests` - HTTP client
## Integration Complete

This project successfully integrates:

- ✅ Transformers image-text-to-text pipeline
- ✅ OpenAI Vision API compatibility
- ✅ Multimodal message processing
- ✅ Production-ready FastAPI service

See `MULTIMODAL_INTEGRATION_COMPLETE.md` for detailed integration documentation.
---

# AI Backend Service

**Status:** ✅ CONVERSION COMPLETE!

Successfully converted from a non-functioning Gradio HuggingFace app to a production-ready FastAPI backend service with OpenAI-compatible API endpoints.
## Quick Start

### 1. Setup Environment

```bash
# Activate the virtual environment
source gradio_env/bin/activate

# Install dependencies (already done)
pip install -r requirements.txt
```

### 2. Start the Backend Service

```bash
python backend_service.py --port 8000 --reload
```

### 3. Test the API

```bash
# Run comprehensive tests
python test_api.py

# Or try usage examples
python usage_examples.py
```
## API Endpoints

| Endpoint | Method | Description |
|---|---|---|
| `/` | GET | Service information |
| `/health` | GET | Health check |
| `/v1/models` | GET | List available models |
| `/v1/chat/completions` | POST | Chat completion (OpenAI compatible) |
| `/v1/completions` | POST | Text completion |
## Example Usage

### Chat Completion

```bash
curl -X POST http://localhost:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "microsoft/DialoGPT-medium",
    "messages": [
      {"role": "user", "content": "Hello! How are you?"}
    ],
    "max_tokens": 150,
    "temperature": 0.7
  }'
```
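A non-streaming response follows the OpenAI chat-completion shape. A sketch of extracting the assistant's reply; the sample JSON below is illustrative, not actual service output:

```python
import json

# Illustrative response body in the OpenAI chat-completion format
# (not captured from the actual service).
sample = json.loads("""
{
  "id": "chatcmpl-123",
  "object": "chat.completion",
  "model": "microsoft/DialoGPT-medium",
  "choices": [
    {
      "index": 0,
      "message": {"role": "assistant", "content": "I'm doing well, thanks for asking!"},
      "finish_reason": "stop"
    }
  ]
}
""")

# The assistant's text lives at choices[0].message.content.
reply = sample["choices"][0]["message"]["content"]
print(reply)
```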
### Streaming Chat

```bash
curl -X POST http://localhost:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "microsoft/DialoGPT-medium",
    "messages": [
      {"role": "user", "content": "Tell me a joke"}
    ],
    "stream": true
  }'
```
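With `"stream": true`, an OpenAI-compatible service emits server-sent events, each a `data:` line carrying a JSON chunk with a `delta`, terminated by `data: [DONE]`. A client-side sketch of accumulating the streamed text (the sample stream below is illustrative, not captured from the service):

```python
import json

def collect_stream(lines):
    """Accumulate assistant text from OpenAI-style SSE 'data:' lines."""
    text = []
    for line in lines:
        line = line.strip()
        if not line.startswith("data:"):
            continue  # skip blank lines and comments
        data = line[len("data:"):].strip()
        if data == "[DONE]":
            break  # end-of-stream sentinel
        chunk = json.loads(data)
        delta = chunk["choices"][0].get("delta", {})
        if "content" in delta:
            text.append(delta["content"])
    return "".join(text)

# Illustrative stream (not real service output):
stream = [
    'data: {"choices":[{"delta":{"role":"assistant"}}]}',
    'data: {"choices":[{"delta":{"content":"Why did the "}}]}',
    'data: {"choices":[{"delta":{"content":"chicken cross the road?"}}]}',
    "data: [DONE]",
]
print(collect_stream(stream))  # Why did the chicken cross the road?
```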
## Files

- `app.py` - Original Gradio ChatInterface (still functional)
- `backend_service.py` - New FastAPI backend service ✅
- `test_api.py` - Comprehensive API testing
- `usage_examples.py` - Simple usage examples
- `requirements.txt` - Updated dependencies
- `CONVERSION_COMPLETE.md` - Detailed conversion documentation
## Features

- ✅ **OpenAI-Compatible API** - Drop-in replacement for the OpenAI API
- ✅ **Async FastAPI** - High-performance async architecture
- ✅ **Streaming Support** - Real-time response streaming
- ✅ **Error Handling** - Robust error handling with fallbacks
- ✅ **Production Ready** - CORS, logging, health checks
- ✅ **Docker Ready** - Easy containerization
- ✅ **Auto-reload** - Development-friendly auto-reload
- ✅ **Type Safety** - Full type hints with Pydantic validation
## Service URLs

- **Backend Service:** http://localhost:8000
- **API Documentation:** http://localhost:8000/docs
- **OpenAPI Spec:** http://localhost:8000/openapi.json
## Model Information

- **Current Model:** `microsoft/DialoGPT-medium`
- **Type:** Conversational AI model
- **Provider:** HuggingFace Inference API
- **Capabilities:** Text generation, chat completion
## Architecture

```
┌──────────────────┐     ┌──────────────────────┐     ┌──────────────────┐
│  Client Request  │────▶│   FastAPI Backend    │────▶│ HuggingFace API  │
│ (OpenAI format)  │     │  (backend_service)   │     │(DialoGPT-medium) │
└──────────────────┘     └──────────────────────┘     └──────────────────┘
                                    │
                                    ▼
                         ┌──────────────────────┐
                         │   OpenAI Response    │
                         │   (JSON/Streaming)   │
                         └──────────────────────┘
```
## Development
The service includes:
- Auto-reload for development
- Comprehensive logging for debugging
- Type checking for code quality
- Test suite for reliability
- Error handling for robustness
## Production Deployment
Ready for production with:
- Environment variables for configuration
- Health check endpoints for monitoring
- CORS support for web applications
- Docker compatibility for containerization
- Structured logging for observability
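Environment-variable configuration can be sketched as follows; the variable names (`HOST`, `PORT`, `MODEL_NAME`, `LOG_LEVEL`) and defaults are illustrative assumptions, so check `backend_service.py` for the ones the service actually reads:

```python
import os

def load_config(env) -> dict:
    """Read service settings from environment variables, falling back to defaults.
    Variable names here are illustrative, not taken from backend_service.py."""
    return {
        "host": env.get("HOST", "0.0.0.0"),
        "port": int(env.get("PORT", "8000")),
        "model": env.get("MODEL_NAME", "microsoft/DialoGPT-medium"),
        "log_level": env.get("LOG_LEVEL", "info"),
    }

config = load_config(os.environ)
print(config)
```

Passing the environment mapping in as an argument keeps the function testable and makes overrides explicit (e.g. `PORT=9000 python backend_service.py`).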
## Conversion Status: COMPLETE!

Successfully transformed from a broken Gradio app into a production-ready AI backend service.

For detailed conversion documentation, see `CONVERSION_COMPLETE.md`.