teachingAssistant / utils /tts_README.md

Michael Hu

refactor tts module

7495571 3 months ago

	# TTS Structure

	This directory contains a Text-to-Speech (TTS) implementation that supports three models:

	1. Kokoro: https://github.com/hexgrad/kokoro
	2. Dia: https://github.com/nari-labs/dia
	3. CosyVoice2: https://github.com/FunAudioLLM/CosyVoice

	## Structure

	The TTS implementation is split into one module per engine, plus a shared base class and a single entry point:

	- `tts.py`: Contains the base `TTSBase` abstract class and `DummyTTS` implementation
	- `tts_kokoro.py`: Kokoro TTS implementation
	- `tts_dia.py`: Dia TTS implementation
	- `tts_cosyvoice2.py`: CosyVoice2 TTS implementation
	- `tts_main.py`: Main entry point for TTS functionality

	## Usage

	```python
	# Import the main TTS functions
	from utils.tts_main import generate_speech, generate_speech_stream, get_tts_engine

	# Generate speech using the best available engine
	audio_path = generate_speech("Hello, world!")

	# Generate speech using a specific engine
	audio_path = generate_speech("Hello, world!", engine_type="kokoro")

	# Generate speech with specific parameters
	audio_path = generate_speech(
	    "Hello, world!",
	    engine_type="dia",
	    lang_code="en",
	    voice="default",
	    speed=1.0,
	)

	# Generate speech stream
	for sample_rate, audio_data in generate_speech_stream("Hello, world!"):
	    # Process audio data
	    pass

	# Get a specific TTS engine instance
	engine = get_tts_engine("kokoro")
	audio_path = engine.generate_speech("Hello, world!")
	```
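
The streaming call yields `(sample_rate, audio_data)` pairs, so a consumer usually accumulates chunks before writing them out. The sketch below shows one way to do that with only the standard library. It assumes mono audio delivered as floats in [-1.0, 1.0], which this README does not confirm, and it uses a hypothetical `fake_speech_stream` in place of `generate_speech_stream` so that it runs without any engine installed.

```python
import array
import io
import math
import wave
from typing import Iterable, Iterator, Sequence, Tuple

Chunk = Tuple[int, Sequence[float]]


def fake_speech_stream(text: str) -> Iterator[Chunk]:
    """Hypothetical stand-in for generate_speech_stream."""
    sample_rate = 24000  # assumed rate, chosen only for this example
    for _ in text.split():
        # One 10 ms chunk of a 440 Hz tone per word, as floats in [-1.0, 1.0].
        yield sample_rate, [
            math.sin(2 * math.pi * 440 * n / sample_rate) for n in range(240)
        ]


def stream_to_wav_bytes(chunks: Iterable[Chunk]) -> bytes:
    """Concatenate streamed chunks into a 16-bit mono WAV held in memory."""
    pcm = array.array("h")  # signed 16-bit samples
    rate = None
    for sample_rate, samples in chunks:
        if rate is None:
            rate = sample_rate
        elif sample_rate != rate:
            raise ValueError("sample rate changed mid-stream")
        for sample in samples:
            clipped = max(-1.0, min(1.0, float(sample)))
            pcm.append(int(clipped * 32767))
    if rate is None:
        raise ValueError("stream produced no audio")

    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)
        wav.setframerate(rate)
        wav.writeframes(pcm.tobytes())
    return buffer.getvalue()


wav_bytes = stream_to_wav_bytes(fake_speech_stream("Hello, world!"))
```

If the real engines yield NumPy arrays or a different sample format, only the conversion loop inside `stream_to_wav_bytes` would need to change.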

	## Error Handling

	All TTS implementations handle missing dependencies and engine failures in the same way:

	1. Each implementation checks for the availability of its dependencies
	2. If a specific engine fails, it automatically falls back to the `DummyTTS` implementation
	3. The main module prioritizes engines based on availability
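
The three rules above can be sketched roughly as follows. This is not the code in `tts_main.py`. The names `dependency_available` and `select_engine`, the candidate tuple layout, and the `DummyTTS` stand-in are all hypothetical, chosen only to illustrate the check-then-fall-back pattern.

```python
import importlib.util
from typing import Callable, Iterable, Tuple


class DummyTTS:
    """Stand-in for the DummyTTS fallback; the real class lives in tts.py."""

    def generate_speech(self, text: str) -> str:
        return "dummy.wav"  # placeholder path, no file is written


def dependency_available(module_name: str) -> bool:
    """Return True if a top-level module can be found, without importing it."""
    return importlib.util.find_spec(module_name) is not None


# Each candidate: (engine name, top-level module it needs, factory).
Candidate = Tuple[str, str, Callable[[], object]]


def select_engine(candidates: Iterable[Candidate]) -> object:
    """Pick the first engine whose dependency is present, else DummyTTS."""
    for _name, module_name, factory in candidates:
        if not dependency_available(module_name):
            continue  # dependency missing: try the next engine in priority order
        try:
            return factory()
        except Exception:
            break  # the chosen engine failed to start: fall back to the dummy
    return DummyTTS()


# Example call; a real setup would pass the Kokoro engine class as the factory.
engine = select_engine([("kokoro", "kokoro", DummyTTS)])
```

Checking with `importlib.util.find_spec` avoids importing a heavy model library just to learn whether it is installed. Whether the actual module probes dependencies this way or with a `try`/`except ImportError` is an assumption here.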

	## Adding New Engines

	To add a new TTS engine:

	1. Create a new file `tts_<engine_name>.py`
	2. Implement a class that inherits from `TTSBase`
	3. Add the engine to the available engines list in `tts_main.py`
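
As a rough illustration of these steps, the sketch below defines a toy engine and registers it. The `TTSBase` shown here is a stand-in written for this example: its method names and the `lang_code`, `voice`, and `speed` parameters are inferred from the usage section, and the real abstract class in `tts.py` may differ. `SilenceTTS`, `AVAILABLE_ENGINES`, and `make_engine` are hypothetical names.

```python
from abc import ABC, abstractmethod
from typing import Dict, Iterator, List, Tuple, Type


class TTSBase(ABC):
    """Stand-in for utils.tts.TTSBase; the real interface may differ."""

    def __init__(self, lang_code: str = "en") -> None:
        self.lang_code = lang_code

    @abstractmethod
    def generate_speech(
        self, text: str, voice: str = "default", speed: float = 1.0
    ) -> str:
        """Synthesize text and return the path of the audio file."""

    @abstractmethod
    def generate_speech_stream(
        self, text: str, voice: str = "default", speed: float = 1.0
    ) -> Iterator[Tuple[int, List[float]]]:
        """Yield (sample_rate, audio_data) chunks."""


class SilenceTTS(TTSBase):
    """Hypothetical engine that would live in tts_silence.py."""

    SAMPLE_RATE = 24000  # assumed rate for the example

    def generate_speech(
        self, text: str, voice: str = "default", speed: float = 1.0
    ) -> str:
        # A real engine would synthesize audio and write it to disk here.
        return "silence.wav"

    def generate_speech_stream(
        self, text: str, voice: str = "default", speed: float = 1.0
    ) -> Iterator[Tuple[int, List[float]]]:
        # One 100 ms chunk of silence per whitespace-separated word.
        for _ in text.split():
            yield self.SAMPLE_RATE, [0.0] * (self.SAMPLE_RATE // 10)


# Hypothetical registry standing in for the engine list in tts_main.py.
AVAILABLE_ENGINES: Dict[str, Type[TTSBase]] = {"silence": SilenceTTS}


def make_engine(engine_type: str, lang_code: str = "en") -> TTSBase:
    """Look up an engine class by name and instantiate it."""
    return AVAILABLE_ENGINES[engine_type](lang_code=lang_code)


engine = make_engine("silence")
audio_path = engine.generate_speech("Hello, world!")
```

Because `TTSBase` is abstract, a subclass that forgets one of the required methods raises `TypeError` when it is instantiated, so an incomplete engine is caught before it is ever used.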