# CosyVoice Version Information
## Current Version: v1.0-cosyvoice-300m
### Models Installed:
- CosyVoice-300M (Main model)
- CosyVoice-300M-SFT (Supervised Fine-Tuning)
- CosyVoice-300M-direct (Zero-shot inference)
- CosyVoice-ttsfrd (Required resources)
### Features:
- Multi-language TTS (Chinese, English, Japanese, Korean)
- Zero-shot voice cloning
- Cross-lingual synthesis
- GPU acceleration on an RTX A5000
### Performance:
- Generation speed: ~1x real-time (roughly one second of compute per second of generated audio)
- Model loading: 5-10 seconds
- GPU: RTX A5000 (24 GB VRAM)
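
A generation-speed figure like the one above is commonly reported as a real-time factor (RTF): the time spent synthesizing divided by the duration of the audio produced, so ~1x real-time corresponds to an RTF near 1.0. The sketch below shows one way to measure it. The `synthesize` callable and the 22050 Hz sample rate are assumptions for illustration, not part of the CosyVoice API; substitute whatever wrapper and sample rate your deployment actually uses.

```python
import time
from typing import Callable


def real_time_factor(synthesis_seconds: float, audio_seconds: float) -> float:
    """RTF = time spent generating / duration of the generated audio."""
    if audio_seconds <= 0:
        raise ValueError("audio_seconds must be positive")
    return synthesis_seconds / audio_seconds


def measure_rtf(synthesize: Callable[[str], int], text: str, sample_rate: int) -> float:
    """Time a single synthesis call.

    `synthesize` is a hypothetical wrapper around the TTS model that
    returns the number of audio samples it produced for `text`.
    """
    start = time.perf_counter()
    num_samples = synthesize(text)
    elapsed = time.perf_counter() - start
    return real_time_factor(elapsed, num_samples / sample_rate)


if __name__ == "__main__":
    # Stand-in synthesizer (assumption): pretends to emit 1 second of audio.
    def fake_synthesize(text: str) -> int:
        return 22050

    rtf = measure_rtf(fake_synthesize, "Hello, world.", sample_rate=22050)
    print(f"RTF: {rtf:.6f}")
```

An RTF below 1.0 means audio is generated faster than it plays back. Model loading time is a one-off cost and is best measured separately from per-utterance RTF.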
### Known Issues:
- Noticeable Chinese accent in English and Portuguese synthesis
- Model was trained primarily on Chinese data
### Next Version:
- CosyVoice2-0.5B (downloading)
- Improved English pronunciation
- Lower latency (150 ms)
- 30-50% reduction in pronunciation errors