# TTS Structure
This directory contains a Text-to-Speech (TTS) implementation that supports three specific models:
1. Kokoro: https://github.com/hexgrad/kokoro
2. Dia: https://github.com/nari-labs/dia
3. CosyVoice2: https://github.com/FunAudioLLM/CosyVoice
## Structure
The implementation is split into one module per engine, plus a shared base module and a main entry point:
- `tts.py`: Contains the base `TTSBase` abstract class and `DummyTTS` implementation
- `tts_kokoro.py`: Kokoro TTS implementation
- `tts_dia.py`: Dia TTS implementation
- `tts_cosyvoice2.py`: CosyVoice2 TTS implementation
- `tts_main.py`: Main entry point for TTS functionality
## Usage
```python
# Import the main TTS functions
from utils.tts_main import generate_speech, generate_speech_stream, get_tts_engine
# Generate speech using the best available engine
audio_path = generate_speech("Hello, world!")
# Generate speech using a specific engine
audio_path = generate_speech("Hello, world!", engine_type="kokoro")
# Generate speech with specific parameters
audio_path = generate_speech(
    "Hello, world!",
    engine_type="dia",
    lang_code="en",
    voice="default",
    speed=1.0,
)
# Generate speech as a stream of (sample_rate, audio_data) chunks
for sample_rate, audio_data in generate_speech_stream("Hello, world!"):
    # Process each audio chunk here
    pass
# Get a specific TTS engine instance
engine = get_tts_engine("kokoro")
audio_path = engine.generate_speech("Hello, world!")
```
## Error Handling
The TTS implementations handle missing dependencies and engine failures as follows:
1. Each implementation checks for the availability of its dependencies
2. If a specific engine fails, the system automatically falls back to the `DummyTTS` implementation
3. The main module prioritizes engines based on availability
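
The fallback behaviour described above could look roughly like the following. This is a minimal sketch, not the code in `tts_main.py`: the `select_engine` and `is_installed` helpers, the factory dictionary, and the priority list are all hypothetical, and the `DummyTTS` class here is only a stand-in for the real one in `tts.py`.

```python
import importlib.util
from typing import Callable, Dict, List


class DummyTTS:
    """Stand-in engine; the real DummyTTS in tts.py may behave differently."""

    def generate_speech(self, text: str) -> str:
        return "dummy_output.wav"


def is_installed(module_name: str) -> bool:
    """Return True if a top-level module can be found, without importing it."""
    return importlib.util.find_spec(module_name) is not None


def select_engine(factories: Dict[str, Callable[[], object]], priority: List[str]):
    """Try engines in priority order; fall back to DummyTTS if none can be built."""
    for name in priority:
        factory = factories.get(name)
        if factory is None:
            continue  # engine not registered
        try:
            return factory()
        except Exception:
            # Missing dependency or failed model load: try the next engine.
            continue
    return DummyTTS()
```

With this shape, an engine whose constructor raises (for example because its package is not installed) is skipped, and callers always receive an object with a `generate_speech` method.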
## Adding New Engines
To add a new TTS engine:
1. Create a new file `tts_<engine_name>.py`
2. Implement a class that inherits from `TTSBase`
3. Add the engine to the available engines list in `tts_main.py`
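
As an illustration of these steps, here is a sketch of what a new engine module might contain. The `TTSBase` class below is a stand-in with an assumed single abstract method; the real base class in `tts.py` may require other methods or arguments. `ToneTTS` and the `AVAILABLE_ENGINES` registry are hypothetical names used only for this example.

```python
import math
import os
import struct
import tempfile
import wave
from abc import ABC, abstractmethod


class TTSBase(ABC):
    """Stand-in for the base class in tts.py; the real interface may differ."""

    @abstractmethod
    def generate_speech(self, text: str, **kwargs) -> str:
        """Synthesize text and return the path of the written audio file."""


class ToneTTS(TTSBase):
    """Hypothetical engine that writes a 440 Hz tone instead of real speech."""

    sample_rate = 16000

    def generate_speech(self, text: str, **kwargs) -> str:
        # Tone length grows with the text length (0.05 s per character).
        n_samples = int(self.sample_rate * 0.05 * max(len(text), 1))
        frames = b"".join(
            struct.pack("<h", int(8000 * math.sin(2 * math.pi * 440 * i / self.sample_rate)))
            for i in range(n_samples)
        )
        fd, path = tempfile.mkstemp(suffix=".wav")
        os.close(fd)
        with wave.open(path, "wb") as wav:
            wav.setnchannels(1)   # mono
            wav.setsampwidth(2)   # 16-bit samples
            wav.setframerate(self.sample_rate)
            wav.writeframes(frames)
        return path


# Hypothetical registry; tts_main.py may store its engine list differently.
AVAILABLE_ENGINES = {"tone": ToneTTS}
```

Calling `AVAILABLE_ENGINES["tone"]().generate_speech("Hello")` writes a short WAV file to a temporary location and returns its path, which mirrors the `audio_path` return value shown in the usage section.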