# OmniAvatar API Documentation

## POST /generate - Avatar Generation

### Request Format

**URL:** `https://huggingface.co/spaces/bravedims/AI_Avatar_Chat/api/generate`

**Method:** `POST`

**Content-Type:** `application/json`
### Request Body (JSON)

```json
{
  "prompt": "string",
  "text_to_speech": "string (optional)",
  "elevenlabs_audio_url": "string (optional)",
  "voice_id": "string (optional, default: '21m00Tcm4TlvDq8ikWAM')",
  "image_url": "string (optional)",
  "guidance_scale": "float (default: 5.0)",
  "audio_scale": "float (default: 3.0)",
  "num_steps": "int (default: 30)",
  "sp_size": "int (default: 1)",
  "tea_cache_l1_thresh": "float (optional)"
}
```
### Request Parameters

| Field | Type | Required | Description |
|-------|------|----------|-------------|
| `prompt` | string | Yes | Character behavior description |
| `text_to_speech` | string | No | Text to convert to speech via ElevenLabs |
| `elevenlabs_audio_url` | string | No | Direct URL to an audio file |
| `voice_id` | string | No | ElevenLabs voice ID (default: Rachel) |
| `image_url` | string | No | Reference image URL |
| `guidance_scale` | float | No | Prompt-following strength (4-6 recommended) |
| `audio_scale` | float | No | Lip-sync accuracy (3-5 recommended) |
| `num_steps` | int | No | Number of generation steps (20-50 recommended) |
| `sp_size` | int | No | Parallel processing size |
| `tea_cache_l1_thresh` | float | No | Cache threshold optimization |

**Note:** Either `text_to_speech` or `elevenlabs_audio_url` must be provided.
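
The audio-source rule in the note can be checked on the client before a request is sent. The following is a minimal sketch, not part of the API itself: `build_generate_payload` is a hypothetical helper name, and rejecting unknown fields is an assumption made here for safety. How the server behaves when both audio fields are supplied is not documented, so the sketch simply passes both through.

```python
from typing import Any, Dict, Optional

# Optional tuning fields taken from the parameter table above.
OPTIONAL_FIELDS = {
    "voice_id",
    "image_url",
    "guidance_scale",
    "audio_scale",
    "num_steps",
    "sp_size",
    "tea_cache_l1_thresh",
}


def build_generate_payload(
    prompt: str,
    text_to_speech: Optional[str] = None,
    elevenlabs_audio_url: Optional[str] = None,
    **options: Any,
) -> Dict[str, Any]:
    """Assemble a /generate request body (hypothetical client-side helper)."""
    if not prompt:
        raise ValueError("prompt is required")
    # Mirror the documented rule: at least one audio source must be present.
    if not text_to_speech and not elevenlabs_audio_url:
        raise ValueError(
            "Either text_to_speech or elevenlabs_audio_url must be provided"
        )
    # Assumption: reject field names that are not in the documented table.
    unknown = set(options) - OPTIONAL_FIELDS
    if unknown:
        raise ValueError(f"Unknown fields: {sorted(unknown)}")

    payload: Dict[str, Any] = {"prompt": prompt}
    if text_to_speech:
        payload["text_to_speech"] = text_to_speech
    if elevenlabs_audio_url:
        payload["elevenlabs_audio_url"] = elevenlabs_audio_url
    # Fields left out here fall back to the server-side defaults.
    payload.update(options)
    return payload
```

Omitting an optional field rather than sending `null` keeps the request close to the documented examples and lets the server apply its own defaults.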
### Example Request

```json
{
  "prompt": "A professional teacher explaining a mathematical concept with clear gestures",
  "text_to_speech": "Hello students! Today we're going to learn about calculus and how derivatives work in real life.",
  "voice_id": "21m00Tcm4TlvDq8ikWAM",
  "image_url": "https://example.com/teacher.jpg",
  "guidance_scale": 5.0,
  "audio_scale": 3.5,
  "num_steps": 30
}
```
### Response Format

**Success Response (200 OK):**

```json
{
  "message": "string",
  "output_path": "string",
  "processing_time": "float",
  "audio_generated": "boolean"
}
```
### Response Fields

| Field | Type | Description |
|-------|------|-------------|
| `message` | string | Success/status message |
| `output_path` | string | Path to the generated video file |
| `processing_time` | float | Processing time in seconds |
| `audio_generated` | boolean | Whether audio was generated from text |
### Example Response

```json
{
  "message": "Avatar generation completed successfully",
  "output_path": "./outputs/avatar_20240807_130512.mp4",
  "processing_time": 45.67,
  "audio_generated": true
}
```
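
A client may prefer typed access to the four response fields over a raw dictionary. The sketch below shows one way to do that; `GenerateResult` and `parse_generate_response` are hypothetical names introduced for illustration, and the code assumes the success body has exactly the shape documented above.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class GenerateResult:
    """Typed view of a 200 OK body from /generate (hypothetical helper)."""

    message: str
    output_path: str
    processing_time: float  # seconds, per the response fields table
    audio_generated: bool


def parse_generate_response(body: str) -> GenerateResult:
    """Parse a JSON success body; raises KeyError if a field is missing."""
    data = json.loads(body)
    return GenerateResult(
        message=str(data["message"]),
        output_path=str(data["output_path"]),
        processing_time=float(data["processing_time"]),
        audio_generated=bool(data["audio_generated"]),
    )
```

Note that `output_path` in the example is a path relative to the server's working directory, so it identifies the file on the server rather than a downloadable URL.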
### Error Responses

**400 Bad Request:**

```json
{
  "detail": "Either text_to_speech or elevenlabs_audio_url must be provided"
}
```

**500 Internal Server Error:**

```json
{
  "detail": "Model not loaded"
}
```

**503 Service Unavailable:**

```json
{
  "detail": "Model not loaded"
}
```
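
All three error bodies share a single `detail` field, so a client can fold them into one exception type. This is a sketch under stated assumptions: `AvatarAPIError` and `error_from_response` are hypothetical names, and treating only 503 as retryable is a client-side choice made here, not something the API specifies.

```python
import json


class AvatarAPIError(Exception):
    """Error raised for non-200 responses (hypothetical client-side type)."""

    def __init__(self, status: int, detail: str) -> None:
        super().__init__(f"{status}: {detail}")
        self.status = status
        self.detail = detail

    @property
    def retryable(self) -> bool:
        # Assumption: 503 is transient (for example, the model is still
        # loading), while 400 and 500 are not worth retrying unchanged.
        return self.status == 503


def error_from_response(status: int, body: str) -> AvatarAPIError:
    """Build an AvatarAPIError from a status code and raw response body."""
    try:
        detail = json.loads(body).get("detail", body)
    except (ValueError, AttributeError):
        # The body is not JSON, or not a JSON object: keep the raw text.
        detail = body
    return AvatarAPIError(status, str(detail))
```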
### Available ElevenLabs Voices

| Voice ID | Name | Description |
|----------|------|-------------|
| `21m00Tcm4TlvDq8ikWAM` | Rachel | Default, clear female voice |
| `pNInz6obpgDQGcFmaJgB` | Adam | Professional male voice |
| `EXAVITQu4vr4xnSDxMaL` | Bella | Expressive female voice |
### Usage Examples

#### With Text-to-Speech

```bash
curl -X POST "https://huggingface.co/spaces/bravedims/AI_Avatar_Chat/api/generate" \
  -H "Content-Type: application/json" \
  -d '{
    "prompt": "A friendly presenter speaking confidently",
    "text_to_speech": "Welcome to our AI avatar demonstration!",
    "voice_id": "21m00Tcm4TlvDq8ikWAM",
    "guidance_scale": 5.5,
    "audio_scale": 4.0
  }'
```

#### With Audio URL

```bash
curl -X POST "https://huggingface.co/spaces/bravedims/AI_Avatar_Chat/api/generate" \
  -H "Content-Type: application/json" \
  -d '{
    "prompt": "A news anchor delivering headlines",
    "elevenlabs_audio_url": "https://example.com/audio.mp3",
    "image_url": "https://example.com/anchor.jpg",
    "num_steps": 40
  }'
```
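
The same calls can be made from Python with only the standard library. The sketch below mirrors the `curl` commands; `make_generate_request` and `generate` are hypothetical helper names, and the 180-second default timeout is an assumption chosen to sit above the 30-120 second processing time quoted in this document.

```python
import json
import urllib.request
from typing import Any, Dict

API_URL = "https://huggingface.co/spaces/bravedims/AI_Avatar_Chat/api/generate"


def make_generate_request(
    payload: Dict[str, Any], url: str = API_URL
) -> urllib.request.Request:
    """Build (but do not send) a JSON POST request for /generate."""
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(payload: Dict[str, Any], timeout: float = 180.0) -> Dict[str, Any]:
    """Send the request and return the decoded JSON body.

    urllib raises urllib.error.HTTPError for 4xx/5xx responses.
    The timeout value is an assumption, not a documented limit.
    """
    request = make_generate_request(payload)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

For example, `generate({"prompt": "A news anchor delivering headlines", "elevenlabs_audio_url": "https://example.com/audio.mp3"})` corresponds to the audio URL example with the optional fields left at their defaults.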
### Other Endpoints

#### GET /health - Health Check

```json
{
  "status": "healthy",
  "model_loaded": true,
  "device": "cuda",
  "supports_elevenlabs": true,
  "supports_image_urls": true,
  "supports_text_to_speech": true,
  "elevenlabs_api_configured": true
}
```
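
Because generation is slow, it can be worth inspecting the health payload before submitting a job. The following sketch interprets an already-decoded `/health` response; `readiness_problems` is a hypothetical helper, and the choice of which flags block a text-to-speech request is an assumption based on the field names alone.

```python
from typing import Any, Dict, List


def readiness_problems(health: Dict[str, Any], need_tts: bool = True) -> List[str]:
    """Return human-readable reasons the service looks unready (empty if fine)."""
    problems: List[str] = []
    if health.get("status") != "healthy":
        problems.append(f"status is {health.get('status')!r}")
    if not health.get("model_loaded"):
        problems.append("model not loaded")
    # Assumption: text-to-speech needs both the capability flag and a
    # configured ElevenLabs API key.
    if need_tts and not (
        health.get("supports_text_to_speech")
        and health.get("elevenlabs_api_configured")
    ):
        problems.append("text-to-speech unavailable")
    return problems
```

Requests that supply `elevenlabs_audio_url` instead of `text_to_speech` can pass `need_tts=False` to skip the ElevenLabs checks.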
#### GET /docs - FastAPI Documentation

Interactive API documentation is available at the `/docs` endpoint.
### Rate Limits & Performance

- **Processing Time:** 30-120 seconds depending on complexity
- **Max Video Length:** Determined by audio length
- **Supported Formats:** MP4 output, MP3/WAV audio input
- **GPU Acceleration:** Enabled on T4+ hardware

---

**Live API Base URL:** `https://huggingface.co/spaces/bravedims/AI_Avatar_Chat`