Marcos Remar
Add version information for v1.0-cosyvoice-300m
9bbdeda
# CosyVoice Version Information
## Current Version: v1.0-cosyvoice-300m
### Models Installed:
- CosyVoice-300M (Main model)
- CosyVoice-300M-SFT (Supervised Fine-Tuning)
- CosyVoice-300M-direct (Zero-shot inference)
- CosyVoice-ttsfrd (Required resources)
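
A quick way to confirm that the models listed above are actually on disk is a small directory check. This is a minimal sketch, not part of CosyVoice itself: the `pretrained_models` root directory and the helper name `missing_models` are assumptions made for illustration, so adjust them to match the local install.

```python
from pathlib import Path

# Model names exactly as listed in this file.
EXPECTED_MODELS = [
    "CosyVoice-300M",
    "CosyVoice-300M-SFT",
    "CosyVoice-300M-direct",
    "CosyVoice-ttsfrd",
]


def missing_models(root, names=EXPECTED_MODELS):
    """Return the names that have no directory under `root` (hypothetical helper)."""
    root = Path(root)
    return [name for name in names if not (root / name).is_dir()]


if __name__ == "__main__":
    # "pretrained_models" is an assumed location, not confirmed by this file.
    missing = missing_models("pretrained_models")
    print("All models present" if not missing else f"Missing: {missing}")
```

The check only verifies that each directory exists; it does not validate the checkpoint files inside.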
### Features:
- Multi-language TTS (Chinese, English, Japanese, Korean)
- Zero-shot voice cloning
- Cross-lingual synthesis
- GPU acceleration (tested on an NVIDIA RTX A5000)
### Performance:
- Generation speed: ~1x real time (about one second of compute per second of synthesized audio)
- Model loading time: 5-10 seconds
- GPU: NVIDIA RTX A5000 (24 GB VRAM)
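
The "~1x real time" figure corresponds to a real-time factor (RTF) near 1, where RTF is synthesis time divided by the duration of the audio produced. The sketch below shows one way to measure it; the helper names `real_time_factor` and `timed` are illustrative and not part of CosyVoice, and the synthesis call itself is left to the caller.

```python
import time


def real_time_factor(synthesis_seconds, audio_seconds):
    """RTF = time spent synthesizing / duration of the generated audio.

    Values below 1 mean faster than real time; values above 1 mean slower.
    """
    if audio_seconds <= 0:
        raise ValueError("audio_seconds must be positive")
    return synthesis_seconds / audio_seconds


def timed(fn, *args, **kwargs):
    """Call fn and return (result, elapsed wall-clock seconds)."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    return result, time.perf_counter() - start


if __name__ == "__main__":
    # Illustrative numbers only: 10 s of audio generated in 10 s of compute.
    print(real_time_factor(10.0, 10.0))
```

To measure a real run, wrap the synthesis call in `timed(...)` and divide the elapsed time by the output duration (number of samples divided by the sample rate).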
### Known Issues:
- Synthesized English and Portuguese speech has a noticeable Chinese accent
- Likely cause: the model was trained primarily on Chinese data
### Next Version:
- CosyVoice2-0.5B (download in progress)
- Expected: improved English pronunciation
- Expected: lower latency (about 150 ms)
- Expected: 30-50% fewer pronunciation errors