---
title: AI Animation & Voice Studio
emoji: 🎬
colorFrom: blue
colorTo: purple
sdk: docker
app_port: 7860
suggested_hardware: cpu-upgrade
suggested_storage: large
pinned: true
license: apache-2.0
short_description: Create mathematical animations with AI-powered narration using Manim
tags:
- text-to-speech
- animation
- mathematics
- manim
- ai-voice
- educational
- visualization
models:
- kokoro-onnx/kokoro-v0_19
datasets: []
startup_duration_timeout: 30m
fullWidth: true
header: default
disable_embedding: false
preload_from_hub: []
---
# AI Animation & Voice Studio 🎬
A powerful application that combines AI-powered text-to-speech with mathematical animation generation using Manim and Kokoro TTS. Create stunning educational content with synchronized voice narration and mathematical visualizations.
## 🚀 Features
- **Text-to-Speech**: High-quality voice synthesis using Kokoro ONNX models
- **Mathematical Animations**: Create stunning mathematical visualizations with Manim
- **LaTeX Support**: Full LaTeX rendering capabilities with TinyTeX
- **Interactive Interface**: User-friendly Gradio web interface
- **Audio Processing**: Advanced audio manipulation with FFmpeg and SoX
## 🛠️ Technology Stack
- **Frontend**: Gradio for interactive web interface
- **Backend**: Python with FastAPI/Flask
- **Animation**: Manim (Mathematical Animation Engine)
- **TTS**: Kokoro ONNX for text-to-speech synthesis
- **LaTeX**: TinyTeX for mathematical typesetting
- **Audio**: FFmpeg, SoX, PortAudio for audio processing
- **Deployment**: Docker container optimized for Hugging Face Spaces
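Once Manim has rendered a silent video and the TTS step has produced narration, the two are typically joined with FFmpeg. A minimal sketch of that step (the file names, codec choices, and the helper itself are illustrative, not part of the app's API):

```python
import subprocess

def mux_narration(video_path, audio_path, out_path):
    """Build an FFmpeg command that muxes a narration track onto a
    silent Manim render without re-encoding the video stream."""
    return [
        "ffmpeg", "-y",
        "-i", video_path,   # silent animation from Manim
        "-i", audio_path,   # narration from the TTS step
        "-c:v", "copy",     # copy the video stream as-is
        "-c:a", "aac",      # encode the narration to AAC
        "-shortest",        # stop at the shorter of the two streams
        out_path,
    ]

cmd = mux_narration("scene.mp4", "narration.wav", "final.mp4")
# Run with: subprocess.run(cmd, check=True)
print(" ".join(cmd))
```

`-shortest` keeps the output from trailing off with silent video when the narration ends early.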
## 📦 Models
This application uses the following pre-trained models:

- **Kokoro TTS** (`kokoro-v0_19.onnx`): high-quality neural text-to-speech model
- **Voice Models** (`voices.bin`): voice embeddings for different speaker characteristics

Models are automatically downloaded during the Docker build process from the official releases.
## 🏃‍♂️ Quick Start
### Using Hugging Face Spaces
1. Visit the Space
2. Wait for the container to load (initial startup may take 3-5 minutes due to model loading)
3. Upload your script or enter text directly
4. Choose animation settings and voice parameters
5. Generate your animated video with AI narration!
### Local Development
```bash
# Clone the repository
git clone https://huggingface.co/spaces/your-username/ai-animation-voice-studio
cd ai-animation-voice-studio

# Build the Docker image
docker build -t ai-animation-studio .

# Run the container
docker run -p 7860:7860 ai-animation-studio
```
Access the application at `http://localhost:7860`.
### Environment Setup
Create a `.env` file with your configuration:

```bash
# Application settings
DEBUG=false
MAX_WORKERS=4

# Model settings
MODEL_PATH=/app/models
CACHE_DIR=/tmp/cache

# Optional: API keys if needed
# OPENAI_API_KEY=your_key_here
```
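When running outside Docker, these variables can be loaded without extra dependencies (python-dotenv is the usual choice; the guard-and-skip behavior below is this sketch's own):

```python
import os

def load_env(path=".env"):
    """Minimal .env loader: KEY=VALUE lines; blank lines and '#' comments
    are skipped, and existing environment variables are not overwritten."""
    if not os.path.exists(path):
        return
    with open(path) as fh:
        for line in fh:
            line = line.strip()
            if not line or line.startswith("#") or "=" not in line:
                continue
            key, _, value = line.partition("=")
            os.environ.setdefault(key.strip(), value.strip())

load_env()  # no-op if .env is absent
```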
## 🎯 Usage Examples
### Basic Text-to-Speech
```python
# Example usage in your code
from src.tts import generate_speech

audio = generate_speech(
    text="Hello, this is a test of the text-to-speech system",
    voice="default",
    speed=1.0
)
```
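Assuming the call returns mono floating-point samples and you know the model's sample rate (Kokoro outputs 24 kHz audio; treat that as an assumption here), the result can be written to disk with the stdlib `wave` module:

```python
import math
import struct
import wave

def save_wav(samples, sample_rate, path):
    """Write mono float samples in [-1, 1] as 16-bit PCM WAV."""
    with wave.open(path, "wb") as wf:
        wf.setnchannels(1)
        wf.setsampwidth(2)  # 16-bit samples
        wf.setframerate(sample_rate)
        wf.writeframes(b"".join(
            struct.pack("<h", int(max(-1.0, min(1.0, s)) * 32767))
            for s in samples
        ))

# Stand-in for TTS output: one second of a 440 Hz tone at 24 kHz
tone = [math.sin(2 * math.pi * 440 * t / 24000) for t in range(24000)]
save_wav(tone, 24000, "narration.wav")
```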
### Mathematical Animation
```python
# Example Manim scene
from manim import *

class Example(Scene):
    def construct(self):
        # Draw a circle, then morph it into a square
        circle = Circle()
        self.play(Create(circle))
        self.play(Transform(circle, Square()))
```
## 📁 Project Structure
```
├── src/                  # Source code
│   ├── tts/              # Text-to-speech modules
│   ├── manim_scenes/     # Manim animation scenes
│   └── utils/            # Utility functions
├── models/               # Pre-trained models (auto-downloaded)
├── output/               # Generated content output
├── requirements.txt      # Python dependencies
├── Dockerfile            # Container configuration
├── gradio_app.py         # Main application entry point
└── README.md             # This file
```
## ⚙️ Configuration
### Docker Environment Variables
- `GRADIO_SERVER_NAME`: Server host (default: `0.0.0.0`)
- `GRADIO_SERVER_PORT`: Server port (default: `7860`)
- `PYTHONPATH`: Python path configuration
- `HF_HOME`: Hugging Face cache directory
### Application Settings
Modify settings in your `.env` file or through environment variables:
- Model parameters
- Audio quality settings
- Animation render settings
- Cache configurations
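One common pattern is to resolve all of these in a single place, reading the environment with the documented defaults as fallbacks (the names mirror the sample `.env`; the helper itself is a sketch, not the app's actual config module):

```python
import os

def get_settings():
    """Collect tunables from the environment, falling back to the
    defaults shown in the sample .env file."""
    return {
        "debug": os.environ.get("DEBUG", "false").lower() == "true",
        "max_workers": int(os.environ.get("MAX_WORKERS", "4")),
        "model_path": os.environ.get("MODEL_PATH", "/app/models"),
        "cache_dir": os.environ.get("CACHE_DIR", "/tmp/cache"),
    }

settings = get_settings()
```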
## 🔧 Development
### Prerequisites
- Docker and Docker Compose
- Python 3.12+
- Git
### Setting Up Development Environment
```bash
# Install dependencies locally for development
pip install -r requirements.txt

# Run tests (if available)
python -m pytest tests/

# Format code
black .
isort .

# Lint code
flake8 .
```
### Building and Testing
```bash
# Build the Docker image
docker build -t your-app-name:dev .

# Test the container locally
docker run --rm -p 7860:7860 your-app-name:dev

# Check container health
docker run --rm your-app-name:dev python -c "import src; print('Import successful')"
```
## 📊 Performance & Hardware
### Recommended Specs for Hugging Face Spaces
- **Hardware**: `cpu-upgrade` (recommended for faster rendering)
- **Storage**: `large` (ample room for models and temporary files)
- **Startup Time**: ~3-5 minutes (due to model loading and TinyTeX setup)
- **Memory Usage**: ~2-3GB during operation
### System Requirements
- **Memory**: Minimum 2GB RAM, recommended 4GB+
- **CPU**: Multi-core processor recommended for faster animation rendering
- **Storage**: ~1.5GB for models and dependencies
- **Network**: Stable connection for initial model downloads
### Optimization Tips
- Models are cached after first download
- Gradio interface uses efficient streaming for large outputs
- Docker multi-stage builds minimize final image size
- TinyTeX installation is optimized for essential packages only
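The cache-after-first-download behavior can be as simple as a download-if-missing guard (the helper name and the callable-based `fetch` argument are this sketch's own, not the app's internals):

```python
from pathlib import Path

def ensure_model(name, fetch, cache_dir="/tmp/cache"):
    """Return the cached path for a model file, calling fetch()
    for the raw bytes only when the file is not cached yet."""
    target = Path(cache_dir) / name
    if not target.exists():
        target.parent.mkdir(parents=True, exist_ok=True)
        target.write_bytes(fetch())
    return target
```

On a Space, pointing `cache_dir` at persistent storage keeps restarts from re-downloading the model files.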
## 🐛 Troubleshooting
### Common Issues
**Build Failures:**

```bash
# Clear Docker cache if build fails
docker system prune -a
docker build --no-cache -t your-app-name .
```
**Model Download Issues:**
- Check internet connection
- Verify model URLs are accessible
- Models will be re-downloaded if corrupted
**Memory Issues:**

- Reduce batch sizes in configuration
- Monitor memory usage with `docker stats`
**Audio Issues:**
- Ensure audio drivers are properly installed
- Check PortAudio configuration
### Getting Help
- Check the Discussions tab
- Review container logs in the Space settings
- Enable debug mode in configuration
- Report issues in the Community tab
### Common Configuration Issues
**Space Configuration:**

- Ensure `app_port: 7860` is set in the README.md front matter
- Check that `sdk: docker` is properly configured
- Verify hardware suggestions match your needs
**Model Loading:**
- Models download automatically on first run
- Check Space logs for download progress
- Restart Space if models fail to load
## 🤝 Contributing
We welcome contributions! Please see our contributing guidelines:
1. Fork the repository
2. Create a feature branch
3. Make your changes
4. Add tests if applicable
5. Submit a pull request
### Code Style
- Follow PEP 8 for Python code
- Use Black for code formatting
- Add docstrings for functions and classes
- Include type hints where appropriate
## 📄 License
This project is licensed under the Apache License 2.0 - see the LICENSE file for details.
## 🙏 Acknowledgments
- Manim Community for the animation engine
- Kokoro TTS for text-to-speech models
- Gradio for the web interface framework
- Hugging Face for hosting and infrastructure
## 📞 Contact
- Author: Your Name
- Email: [email protected]
- GitHub: @your-username
- Hugging Face: @your-username
*Built with ❤️ for the open-source community*