Update README.md

README.md CHANGED

@@ -1,348 +1,306 @@
**Removed** (the previous TheoremExplainAgent README):

[License](https://github.com/TIGER-AI-Lab/TheoremExplainAgent/blob/main/LICENSE) | [GitHub](https://github.com/TIGER-AI-Lab/TheoremExplainAgent) | [Hits](https://hits.seeyoufarm.com)
Install the dependencies:

```shell
pip install -r requirements.txt
```
Download the Kokoro TTS model and voices:

```shell
mkdir -p models && wget -P models https://github.com/thewh1teagle/kokoro-onnx/releases/download/model-files/kokoro-v0_19.onnx && wget -P models https://github.com/thewh1teagle/kokoro-onnx/releases/download/model-files/voices.bin
```
Create a `.env` file:

```shell
touch .env
```

Then open the `.env` file and edit it with whatever text editor you like.

Your `.env` file should look like the following:

```shell
# OpenAI
OPENAI_API_KEY=""

# Azure OpenAI
AZURE_API_KEY=""
AZURE_API_BASE=""
AZURE_API_VERSION=""

# Google Vertex AI
VERTEXAI_PROJECT=""
VERTEXAI_LOCATION=""
GOOGLE_APPLICATION_CREDENTIALS=""

# Google Gemini
GEMINI_API_KEY=""

...

# Kokoro TTS Settings
KOKORO_MODEL_PATH="models/kokoro-v0_19.onnx"
KOKORO_VOICES_PATH="models/voices.bin"
KOKORO_DEFAULT_VOICE="af"
KOKORO_DEFAULT_SPEED="1.0"
KOKORO_DEFAULT_LANG="en-us"
```

Fill in the API keys for the models you want to use.
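The Kokoro entries above can be consumed at application startup; the following is a minimal sketch using only the standard library. The `Kokoro` usage shown in comments follows the `kokoro-onnx` package and is left hedged rather than executed:

```python
import os

# Resolve the Kokoro TTS settings, with defaults mirroring the .env above
model_path = os.getenv("KOKORO_MODEL_PATH", "models/kokoro-v0_19.onnx")
voices_path = os.getenv("KOKORO_VOICES_PATH", "models/voices.bin")
voice = os.getenv("KOKORO_DEFAULT_VOICE", "af")
speed = float(os.getenv("KOKORO_DEFAULT_SPEED", "1.0"))
lang = os.getenv("KOKORO_DEFAULT_LANG", "en-us")

# With the kokoro-onnx package, synthesis would typically look like:
#   from kokoro_onnx import Kokoro
#   kokoro = Kokoro(model_path, voices_path)
#   samples, sample_rate = kokoro.create("Hello!", voice=voice, speed=speed, lang=lang)
print(model_path, voice, speed, lang)
```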
Add the project root to your Python path:

```shell
export PYTHONPATH=$(pwd):$PYTHONPATH
```
### Generation

To generate a video for a single topic:

```shell
python generate_video.py \
    --model "openai/o3-mini" \
    --helper_model "openai/o3-mini" \
    --output_dir "output/your_exp_name" \
    --topic "your_topic" \
    --context "description of your topic, e.g. 'This is a topic about the properties of a triangle'"
```
For example:

```shell
python generate_video.py \
    --model "openai/o3-mini" \
    --helper_model "openai/o3-mini" \
    --output_dir "output/my_exp_name" \
    --topic "Big O notation" \
    --context "most common type of asymptotic notation in computer science used to measure worst case complexity"
```
To generate videos for a batch of theorems from a JSON file:

```shell
python generate_video.py \
    --model "openai/o3-mini" \
    --helper_model "openai/o3-mini" \
    --output_dir "output/my_exp_name" \
    --theorems_path data/thb_easy/math.json \
    --max_scene_concurrency 7 \
    --max_topic_concurrency 20
```
To generate a video with Retrieval Augmented Generation (RAG):

```shell
python generate_video.py \
    --model "openai/o3-mini" \
    --helper_model "openai/o3-mini" \
    --output_dir "output/with_rag/o3-mini/vtutorbench_easy/math" \
    --topic "Big O notation" \
    --context "most common type of asymptotic notation in computer science used to measure worst case complexity" \
    --use_rag \
    --chroma_db_path "data/rag/chroma_db" \
    --manim_docs_path "data/rag/manim_docs" \
    --embedding_model "vertex_ai/text-embedding-005"
```
Full usage:

```shell
usage: generate_video.py [-h]
                         [--model]
                         [--topic TOPIC] [--context CONTEXT]
                         [--helper_model]
                         [--only_gen_vid] [--only_combine] [--peek_existing_videos] [--output_dir OUTPUT_DIR] [--theorems_path THEOREMS_PATH]
                         [--sample_size SAMPLE_SIZE] [--verbose] [--max_retries MAX_RETRIES] [--use_rag] [--use_visual_fix_code]
                         [--chroma_db_path CHROMA_DB_PATH] [--manim_docs_path MANIM_DOCS_PATH]
                         [--embedding_model {azure/text-embedding-3-large,vertex_ai/text-embedding-005}] [--use_context_learning]
                         [--context_learning_path CONTEXT_LEARNING_PATH] [--use_langfuse] [--max_scene_concurrency MAX_SCENE_CONCURRENCY]
                         [--max_topic_concurrency MAX_TOPIC_CONCURRENCY] [--debug_combine_topic DEBUG_COMBINE_TOPIC] [--only_plan] [--check_status]
                         [--only_render] [--scenes SCENES [SCENES ...]]

Generate Manim videos using AI

options:
  -h, --help            show this help message and exit
  --model               Select the AI model to use
  --topic TOPIC         Topic to generate videos for
  --context CONTEXT     Context of the topic
  --helper_model        Select the helper model to use
  --only_gen_vid        Only generate videos to existing plans
  --only_combine        Only combine videos
  --peek_existing_videos, --peek
                        Peek at existing videos
  --output_dir OUTPUT_DIR
                        Output directory
  --theorems_path THEOREMS_PATH
                        Path to theorems json file
  --sample_size SAMPLE_SIZE, --sample SAMPLE_SIZE
                        Number of theorems to sample
  --verbose             Print verbose output
  --max_retries MAX_RETRIES
                        Maximum number of retries for code generation
  --use_rag, --rag      Use Retrieval Augmented Generation
  --use_visual_fix_code, --visual_fix_code
                        Use VLM to fix code with rendered visuals
  --chroma_db_path CHROMA_DB_PATH
                        Path to Chroma DB
  --manim_docs_path MANIM_DOCS_PATH
                        Path to manim docs
  --embedding_model {azure/text-embedding-3-large,vertex_ai/text-embedding-005}
                        Select the embedding model to use
  --use_context_learning
                        Use context learning with example Manim code
  --context_learning_path CONTEXT_LEARNING_PATH
                        Path to context learning examples
  --use_langfuse        Enable Langfuse logging
  --max_scene_concurrency MAX_SCENE_CONCURRENCY
                        Maximum number of scenes to process concurrently
  --max_topic_concurrency MAX_TOPIC_CONCURRENCY
                        Maximum number of topics to process concurrently
  --debug_combine_topic DEBUG_COMBINE_TOPIC
                        Debug combine videos
  --only_plan           Only generate scene outline and implementation plans
  --check_status        Check planning and code status for all theorems
  --only_render         Only render scenes without combining videos
  --scenes SCENES [SCENES ...]
                        Specific scenes to process (if theorems_path is provided)
```
Evaluation options:

```shell
  [--eval_type {text,video,image,all}] --file_path FILE_PATH --output_folder OUTPUT_FOLDER [--retry_limit RETRY_LIMIT] [--combine] [--bulk_evaluate] [--target_fps TARGET_FPS]
  [--use_parent_folder_as_topic] [--max_workers MAX_WORKERS]

Automatic evaluation of theorem explanation videos with LLMs

options:
  -h, --help            show this help message and exit
  --model_text {gemini/gemini-1.5-pro-002,gemini/gemini-1.5-flash-002,gemini/gemini-2.0-flash-001,vertex_ai/gemini-1.5-flash-002,vertex_ai/gemini-1.5-pro-002,vertex_ai/gemini-2.0-flash-001,openai/o3-mini,gpt-4o,azure/gpt-4o,azure/gpt-4o-mini,bedrock/anthropic.claude-3-5-sonnet-20240620-v1:0,bedrock/anthropic.claude-3-5-sonnet-20241022-v2:0,bedrock/anthropic.claude-3-5-haiku-20241022-v1:0,bedrock/us.anthropic.claude-3-7-sonnet-20250219-v1:0}
                        Select the AI model to use for text evaluation
  --model_video {gemini/gemini-1.5-pro-002,gemini/gemini-2.0-flash-exp,gemini/gemini-2.0-pro-exp-02-05}
                        Select the AI model to use for video evaluation
  --model_image {gemini/gemini-1.5-pro-002,gemini/gemini-1.5-flash-002,gemini/gemini-2.0-flash-001,vertex_ai/gemini-1.5-flash-002,vertex_ai/gemini-1.5-pro-002,vertex_ai/gemini-2.0-flash-001,openai/o3-mini,gpt-4o,azure/gpt-4o,azure/gpt-4o-mini,bedrock/anthropic.claude-3-5-sonnet-20240620-v1:0,bedrock/anthropic.claude-3-5-sonnet-20241022-v2:0,bedrock/anthropic.claude-3-5-haiku-20241022-v1:0,bedrock/us.anthropic.claude-3-7-sonnet-20250219-v1:0}
                        Select the AI model to use for image evaluation
  --eval_type {text,video,image,all}
                        Type of evaluation to perform
  --file_path FILE_PATH
                        Path to a file or a theorem folder
  --output_folder OUTPUT_FOLDER
                        Directory to store the evaluation files
  --retry_limit RETRY_LIMIT
                        Number of retry attempts for each inference
  --combine             Combine all results into a single JSON file
  --bulk_evaluate       Evaluate a folder of theorems together
  --target_fps TARGET_FPS
                        Target FPS for video processing. If not set, original video FPS will be used
  --use_parent_folder_as_topic
                        Use parent folder name as topic name for single file evaluation
  --max_workers MAX_WORKERS
                        Maximum number of concurrent workers for parallel processing
```

* For `file_path`, it is recommended to pass a folder containing both an MP4 file and an SRT file.
254 |
|
255 |
-
##
|
256 |
|
257 |
-
|
258 |
|
259 |
-
|
260 |
-
|
|
|
|
|
261 |
|
262 |
-
|
263 |
-
cd task_generator
|
264 |
-
python parse_prompt.py
|
265 |
-
cd ..
|
266 |
-
```
### FAQ

A: Check your Manim installation. <br>

A: It could be API-related issues. Make sure your `.env` file is properly configured (fill in your API keys), or you can enable litellm debug mode to figure out the issues. <br>
## Citation

```bibtex
archivePrefix={arXiv},
primaryClass={cs.AI},
url={https://arxiv.org/abs/2502.19400},
}
```
## Acknowledgements

* [Manim Community](https://www.manim.community/)
* [kokoro-manim-voiceover](https://github.com/xposed73/kokoro-manim-voiceover)
* [manim-physics](https://github.com/Matheart/manim-physics)
* [manim-Chemistry](https://github.com/UnMolDeQuimica/manim-Chemistry)
* [ManimML](https://github.com/helblazer811/ManimML)
* [manim-dsa](https://github.com/F4bbi/manim-dsa)
* [manim-circuit](https://github.com/Mr-FuzzyPenguin/manim-circuit)
**Added** (the new README):

---
title: AI Animation & Voice Studio
emoji: 🎬
colorFrom: blue
colorTo: purple
sdk: docker
app_port: 7860
suggested_hardware: cpu-upgrade
suggested_storage: small
pinned: true
license: apache-2.0
short_description: "Create AI-powered mathematical animations using Manim"
tags:
  - text-to-speech
  - animation
  - mathematics
  - manim
  - ai-voice
  - educational
  - visualization
models:
  - kokoro-onnx/kokoro-v0_19
datasets: []
startup_duration_timeout: 30m
fullWidth: true
header: default
disable_embedding: false
preload_from_hub: []
---
# AI Animation & Voice Studio 🎬

A powerful application that combines AI-powered text-to-speech with mathematical animation generation using Manim and Kokoro TTS. Create stunning educational content with synchronized voice narration and mathematical visualizations.
## ✨ Features

- **Text-to-Speech**: High-quality voice synthesis using Kokoro ONNX models
- **Mathematical Animations**: Create stunning mathematical visualizations with Manim
- **LaTeX Support**: Full LaTeX rendering capabilities with TinyTeX
- **Interactive Interface**: User-friendly Gradio web interface
- **Audio Processing**: Advanced audio manipulation with FFmpeg and SoX
## 🛠️ Technology Stack

- **Frontend**: Gradio for the interactive web interface
- **Backend**: Python with FastAPI/Flask
- **Animation**: Manim (Mathematical Animation Engine)
- **TTS**: Kokoro ONNX for text-to-speech synthesis
- **LaTeX**: TinyTeX for mathematical typesetting
- **Audio**: FFmpeg, SoX, PortAudio for audio processing
- **Deployment**: Docker container optimized for Hugging Face Spaces
## 📦 Models

This application uses the following pre-trained models:

- **Kokoro TTS**: `kokoro-v0_19.onnx` - High-quality neural text-to-speech model
- **Voice Models**: `voices.bin` - Voice embeddings for different speaker characteristics

Models are automatically downloaded during the Docker build process from the official releases.
## 🏃‍♂️ Quick Start

### Using Hugging Face Spaces

1. Visit the [Space](https://huggingface.co/spaces/your-username/ai-animation-voice-studio)
2. Wait for the container to load (initial startup may take 3-5 minutes due to model loading)
3. Upload your script or enter text directly
4. Choose animation settings and voice parameters
5. Generate your animated video with AI narration!

### Local Development

```bash
# Clone the repository
git clone https://huggingface.co/spaces/your-username/ai-animation-voice-studio
cd ai-animation-voice-studio

# Build the Docker image
docker build -t ai-animation-studio .

# Run the container
docker run -p 7860:7860 ai-animation-studio
```
Access the application at `http://localhost:7860`.

### Environment Setup

Create a `.env` file with your configuration:

```env
# Application settings
DEBUG=false
MAX_WORKERS=4

# Model settings
MODEL_PATH=/app/models
CACHE_DIR=/tmp/cache

# Optional: API keys if needed
# OPENAI_API_KEY=your_key_here
```
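A `.env` file in this shape can be loaded without extra dependencies; the app itself may use a library such as `python-dotenv`, so this parser is only an illustrative sketch:

```python
def load_env(path=".env"):
    """Parse simple KEY=value lines, skipping blanks and # comments."""
    values = {}
    with open(path) as fh:
        for line in fh:
            line = line.strip()
            if not line or line.startswith("#"):
                continue
            key, _, value = line.partition("=")
            values[key.strip()] = value.strip()
    return values

# Example: write and read back two of the settings shown above
with open("/tmp/example.env", "w") as fh:
    fh.write("# Application settings\nDEBUG=false\nMAX_WORKERS=4\n")

config = load_env("/tmp/example.env")
print(config)  # {'DEBUG': 'false', 'MAX_WORKERS': '4'}
```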
## 🎯 Usage Examples

### Basic Text-to-Speech

```python
# Example usage in your code
from src.tts import generate_speech

audio = generate_speech(
    text="Hello, this is a test of the text-to-speech system",
    voice="default",
    speed=1.0
)
```
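If `generate_speech` returns raw samples plus a sample rate (an assumption; check `src/tts` for the actual return type), they can be written to a WAV file with the standard library:

```python
import struct
import wave

def save_wav(samples, sample_rate, path):
    """Write mono float samples in [-1, 1] as 16-bit PCM."""
    with wave.open(path, "wb") as wav:
        wav.setnchannels(1)   # mono
        wav.setsampwidth(2)   # 16-bit samples
        wav.setframerate(sample_rate)
        frames = b"".join(
            struct.pack("<h", int(max(-1.0, min(1.0, s)) * 32767)) for s in samples
        )
        wav.writeframes(frames)

# Tiny demo with a 10-sample ramp standing in for real TTS output
save_wav([i / 10 for i in range(10)], 24000, "/tmp/demo.wav")
```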
### Mathematical Animation

```python
# Example Manim scene
from manim import *

class Example(Scene):
    def construct(self):
        # Render a simple equation and fade it in
        equation = MathTex(r"e^{i\pi} + 1 = 0")
        self.play(FadeIn(equation))
        self.wait()
```

Render it locally with the standard Manim CLI, e.g. `manim -pql scene.py Example`.
## 📁 Project Structure

```
├── src/                 # Source code
│   ├── tts/             # Text-to-speech modules
│   ├── manim_scenes/    # Manim animation scenes
│   └── utils/           # Utility functions
├── models/              # Pre-trained models (auto-downloaded)
├── output/              # Generated content output
├── requirements.txt     # Python dependencies
├── Dockerfile           # Container configuration
├── gradio_app.py        # Main application entry point
└── README.md            # This file
```
## ⚙️ Configuration

### Docker Environment Variables

- `GRADIO_SERVER_NAME`: Server host (default: `0.0.0.0`)
- `GRADIO_SERVER_PORT`: Server port (default: `7860`)
- `PYTHONPATH`: Python path configuration
- `HF_HOME`: Hugging Face cache directory

### Application Settings

Modify settings in your `.env` file or through environment variables:

- Model parameters
- Audio quality settings
- Animation render settings
- Cache configurations
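The two Gradio variables map onto Gradio's `launch()` parameters; a hedged sketch (the actual entry point in `gradio_app.py` may wire them differently):

```python
import os

# Resolve server settings with the documented defaults
server_name = os.getenv("GRADIO_SERVER_NAME", "0.0.0.0")
server_port = int(os.getenv("GRADIO_SERVER_PORT", "7860"))

# In gradio_app.py these would typically be passed to launch(), e.g.:
#   demo.launch(server_name=server_name, server_port=server_port)
print(server_name, server_port)
```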
## 🔧 Development

### Prerequisites

- Docker and Docker Compose
- Python 3.12+
- Git

### Setting Up the Development Environment
```bash
# Install dependencies locally for development
pip install -r requirements.txt

# Run tests (if available)
python -m pytest tests/

# Format code
black .
isort .

# Lint code
flake8 .
```
### Building and Testing

```bash
# Build the Docker image
docker build -t your-app-name:dev .

# Test the container locally
docker run --rm -p 7860:7860 your-app-name:dev

# Check container health
docker run --rm your-app-name:dev python -c "import src; print('Import successful')"
```
## 📊 Performance & Hardware

### Recommended Specs for Hugging Face Spaces

- **Hardware**: `cpu-upgrade` (recommended for faster rendering)
- **Storage**: `small` (sufficient for models and temporary files)
- **Startup Time**: ~3-5 minutes (due to model loading and TinyTeX setup)
- **Memory Usage**: ~2-3 GB during operation
### System Requirements

- **Memory**: Minimum 2 GB RAM, recommended 4 GB+
- **CPU**: Multi-core processor recommended for faster animation rendering
- **Storage**: ~1.5 GB for models and dependencies
- **Network**: Stable connection for initial model downloads

### Optimization Tips

- Models are cached after the first download
- The Gradio interface uses efficient streaming for large outputs
- Docker multi-stage builds minimize the final image size
- The TinyTeX installation is optimized for essential packages only
## 🐛 Troubleshooting

### Common Issues

**Build Failures**:

```bash
# Clear the Docker cache if the build fails
docker system prune -a
docker build --no-cache -t your-app-name .
```

**Model Download Issues**:

- Check your internet connection
- Verify that the model URLs are accessible
- Models will be re-downloaded if corrupted

**Memory Issues**:

- Reduce batch sizes in the configuration
- Monitor memory usage with `docker stats`

**Audio Issues**:

- Ensure audio drivers are properly installed
- Check the PortAudio configuration
### Getting Help

1. Check the [Discussions](https://huggingface.co/spaces/your-username/ai-animation-voice-studio/discussions) tab
2. Review the container logs in the Space settings
3. Enable debug mode in the configuration
4. Report issues in the Community tab

### Common Configuration Issues

**Space Configuration**:

- Ensure `app_port: 7860` is set in the README.md front matter
- Check that `sdk: docker` is properly configured
- Verify that the hardware suggestions match your needs

**Model Loading**:

- Models download automatically on first run
- Check the Space logs for download progress
- Restart the Space if models fail to load
## 🤝 Contributing

We welcome contributions! Please follow these guidelines:

1. Fork the repository
2. Create a feature branch
3. Make your changes
4. Add tests if applicable
5. Submit a pull request

### Code Style

- Follow PEP 8 for Python code
- Use Black for code formatting
- Add docstrings for functions and classes
- Include type hints where appropriate
## 📄 License

This project is licensed under the Apache License 2.0 - see the [LICENSE](LICENSE) file for details.

## 🙏 Acknowledgments

- [Manim Community](https://www.manim.community/) for the animation engine
- [Kokoro TTS](https://github.com/thewh1teagle/kokoro-onnx) for the text-to-speech models
- [Gradio](https://gradio.app/) for the web interface framework
- [Hugging Face](https://huggingface.co/) for hosting and infrastructure

## 📞 Contact

- **Author**: Your Name
- **Email**: [email protected]
- **GitHub**: [@your-username](https://github.com/your-username)
- **Hugging Face**: [@your-username](https://huggingface.co/your-username)

---

*Built with ❤️ for the open-source community*