| # TTS Structure | |
| This directory contains a Text-to-Speech (TTS) implementation that supports three models: | |
| 1. Kokoro: https://github.com/hexgrad/kokoro | |
| 2. Dia: https://github.com/nari-labs/dia | |
| 3. CosyVoice2: https://github.com/FunAudioLLM/CosyVoice | |
| ## Structure | |
| The TTS implementation is split into one module per engine, plus a shared base module and a single entry point: | |
| - `tts.py`: Contains the base `TTSBase` abstract class and `DummyTTS` implementation | |
| - `tts_kokoro.py`: Kokoro TTS implementation | |
| - `tts_dia.py`: Dia TTS implementation | |
| - `tts_cosyvoice2.py`: CosyVoice2 TTS implementation | |
| - `tts_main.py`: Main entry point for TTS functionality | |
| ## Usage | |
| ```python | |
| # Import the main TTS functions | |
| from utils.tts_main import generate_speech, generate_speech_stream, get_tts_engine | |
| # Generate speech using the best available engine | |
| audio_path = generate_speech("Hello, world!") | |
| # Generate speech using a specific engine | |
| audio_path = generate_speech("Hello, world!", engine_type="kokoro") | |
| # Generate speech with specific parameters | |
| audio_path = generate_speech( | |
|     "Hello, world!", | |
|     engine_type="dia", | |
|     lang_code="en", | |
|     voice="default", | |
|     speed=1.0 | |
| ) | |
| # Generate speech stream | |
| for sample_rate, audio_data in generate_speech_stream("Hello, world!"): | |
|     # Process audio data | |
|     pass | |
| # Get a specific TTS engine instance | |
| engine = get_tts_engine("kokoro") | |
| audio_path = engine.generate_speech("Hello, world!") | |
| ``` | |
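The streaming loop above leaves the processing step open. One possible way to consume the stream is to write the chunks to a WAV file, sketched below with the standard library only. This is an illustration under assumptions, not the project's own code: `fake_speech_stream` is a hypothetical stand-in for `generate_speech_stream`, and the chunk format (mono floats in the range -1.0 to 1.0) and the 24000 Hz sample rate are guesses that may not match what the real engines yield.

```python
import io
import math
import wave
from array import array


def fake_speech_stream(text):
    """Hypothetical stand-in for generate_speech_stream: yields (sample_rate, samples)."""
    sample_rate = 24000  # assumed rate; the real engines may use a different one
    for _ in text.split():
        # 0.1 s of a 440 Hz tone per word, as floats in [-1.0, 1.0]
        yield sample_rate, [
            math.sin(2 * math.pi * 440 * n / sample_rate) for n in range(2400)
        ]


def stream_to_wav(chunks, target):
    """Write streamed float chunks to `target` as 16-bit mono PCM.

    `target` may be a file path or a writable, seekable file object.
    Returns the number of frames written.
    """
    total = 0
    wav = None
    try:
        for sample_rate, samples in chunks:
            if wav is None:
                # Open lazily so the header uses the stream's own sample rate
                wav = wave.open(target, "wb")
                wav.setnchannels(1)
                wav.setsampwidth(2)  # 16-bit samples
                wav.setframerate(sample_rate)
            # Clamp to [-1.0, 1.0], then scale to the signed 16-bit range
            pcm = array("h", (int(max(-1.0, min(1.0, s)) * 32767) for s in samples))
            wav.writeframes(pcm.tobytes())
            total += len(pcm)
    finally:
        if wav is not None:
            wav.close()
    return total


if __name__ == "__main__":
    buffer = io.BytesIO()
    frames = stream_to_wav(fake_speech_stream("Hello, world!"), buffer)
    print(f"wrote {frames} frames")
```

If the real stream yields a different sample type (for example integer arrays), only the conversion line inside `stream_to_wav` would need to change.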
| ## Error Handling | |
| The TTS implementations handle missing dependencies and engine failures as follows: | |
| 1. Each implementation checks for the availability of its dependencies | |
| 2. If a specific engine fails, it automatically falls back to the `DummyTTS` implementation | |
| 3. The main module prioritizes engines based on availability | |
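The three behaviours listed above could be sketched roughly as follows. This is a minimal illustration, not the code in `tts_main.py`: the `TTSBase` and `DummyTTS` classes here are simplified stand-ins, and `FancyTTS`, `fancy_tts_backend`, `is_available`, `pick_engine`, and `safe_generate` are hypothetical names introduced only for the example.

```python
import importlib.util
from abc import ABC, abstractmethod


class TTSBase(ABC):
    """Stand-in for the base class in tts.py (the real interface may differ)."""

    @abstractmethod
    def generate_speech(self, text: str) -> str:
        """Return a path to the generated audio file."""


class DummyTTS(TTSBase):
    """Stand-in fallback engine: produces no real audio."""

    def generate_speech(self, text: str) -> str:
        return "dummy.wav"  # hypothetical placeholder path


class FancyTTS(TTSBase):
    """Hypothetical engine that needs an optional third-party package."""

    REQUIRED_MODULE = "fancy_tts_backend"  # made-up module name

    @classmethod
    def is_available(cls) -> bool:
        # find_spec returns None when a top-level module is not installed
        return importlib.util.find_spec(cls.REQUIRED_MODULE) is not None

    def generate_speech(self, text: str) -> str:
        raise RuntimeError("backend not wired up in this sketch")


def pick_engine(candidates: list[type[TTSBase]]) -> TTSBase:
    """Return the first engine whose dependencies are present, else DummyTTS."""
    for engine_cls in candidates:
        # Engines without an is_available hook are treated as available
        is_available = getattr(engine_cls, "is_available", lambda: True)
        if is_available():
            return engine_cls()
    return DummyTTS()


def safe_generate(engine: TTSBase, text: str) -> str:
    """Run an engine and fall back to DummyTTS if it raises."""
    try:
        return engine.generate_speech(text)
    except Exception:
        return DummyTTS().generate_speech(text)


if __name__ == "__main__":
    engine = pick_engine([FancyTTS])
    print(type(engine).__name__, safe_generate(engine, "Hello, world!"))
```

Checking with `importlib.util.find_spec` avoids importing a heavy model package just to learn whether it is installed; the order of the `candidates` list plays the role of the engine priority.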
| ## Adding New Engines | |
| To add a new TTS engine: | |
| 1. Create a new file `tts_<engine_name>.py` | |
| 2. Implement a class that inherits from `TTSBase` | |
| 3. Add the engine to the available engines list in `tts_main.py` |
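
As a rough illustration of the steps above, the sketch below adds a made-up `echo` engine. Everything in it is an assumption made for the example: the simplified `TTSBase` stand-in, its method signature, the `EchoTTS` class, the `AVAILABLE_ENGINES` dictionary, and the `lookup_engine` helper are hypothetical, and the actual base class and registry in `tts.py` and `tts_main.py` may look different.

```python
from abc import ABC, abstractmethod


class TTSBase(ABC):
    """Stand-in for the base class in tts.py; the real one may define more."""

    def __init__(self, lang_code: str = "en"):
        self.lang_code = lang_code

    @abstractmethod
    def generate_speech(self, text: str, voice: str = "default", speed: float = 1.0) -> str:
        """Return a path to the generated audio file."""


# Steps 1 and 2: this class would live in tts_echo.py (hypothetical engine name)
class EchoTTS(TTSBase):
    def generate_speech(self, text: str, voice: str = "default", speed: float = 1.0) -> str:
        # A real engine would synthesize audio and write it to disk here;
        # this sketch only builds a file name.
        return f"echo_{voice}_{len(text)}.wav"


# Step 3: a hypothetical registry of the kind tts_main.py might keep
AVAILABLE_ENGINES: dict[str, type[TTSBase]] = {"echo": EchoTTS}


def lookup_engine(engine_type: str, lang_code: str = "en") -> TTSBase:
    """Instantiate a registered engine by name (illustrative helper)."""
    try:
        engine_cls = AVAILABLE_ENGINES[engine_type]
    except KeyError:
        raise ValueError(f"unknown engine: {engine_type!r}") from None
    return engine_cls(lang_code=lang_code)


if __name__ == "__main__":
    engine = lookup_engine("echo")
    print(engine.generate_speech("Hello, world!"))
```

Keeping the registry as a name-to-class mapping means a new engine needs only one extra entry, and an unknown name fails with a clear error instead of a bare `KeyError`.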