cosyvoice / VERSION.md
Marcos Remar
Add version information for v1.0-cosyvoice-300m
9bbdeda

CosyVoice Version Information

Current Version: v1.0-cosyvoice-300m

Models Installed:

  • CosyVoice-300M (Main model)
  • CosyVoice-300M-SFT (Supervised Fine-Tuning)
  • CosyVoice-300M-direct (Zero-shot inference)
  • CosyVoice-ttsfrd (Required resources)

Features:

  • Multi-language TTS (Chinese, English, Japanese, Korean)
  • Zero-shot voice cloning
  • Cross-lingual synthesis
  • GPU acceleration with RTX A5000

Performance:

  • Generation speed: ~1x real-time (about 1 second of compute per second of generated audio)
  • Model loading: 5-10 seconds
  • GPU: RTX A5000 (24GB VRAM)
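
The "~1x real-time" figure can be checked by computing a real-time factor (RTF): wall-clock synthesis time divided by the duration of the audio produced. The following is a minimal sketch and not part of CosyVoice itself. The names `real_time_factor` and `measure_rtf` are hypothetical, the `synthesize` callable stands in for whatever inference call is in use, and the 22050 Hz sample rate is an assumption that should be replaced with the rate the installed model actually outputs.

```python
import time
from typing import Callable


def real_time_factor(elapsed_seconds: float, num_samples: int, sample_rate: int) -> float:
    """Return compute seconds spent per second of generated audio.

    An RTF of 1.0 corresponds to "1x real-time"; lower is faster.
    """
    if num_samples <= 0 or sample_rate <= 0:
        raise ValueError("num_samples and sample_rate must be positive")
    audio_seconds = num_samples / sample_rate
    return elapsed_seconds / audio_seconds


def measure_rtf(synthesize: Callable[[str], int], text: str, sample_rate: int) -> float:
    """Time one call to `synthesize` and return its real-time factor.

    `synthesize` is a hypothetical stand-in: it takes the input text and
    returns the number of audio samples it generated.
    """
    start = time.perf_counter()
    num_samples = synthesize(text)
    elapsed = time.perf_counter() - start
    return real_time_factor(elapsed, num_samples, sample_rate)


if __name__ == "__main__":
    # Assumed sample rate; 2 seconds of compute for 2 seconds of audio.
    print(real_time_factor(2.0, 44100, 22050))
```

Model loading time (5-10 seconds above) is a one-time cost and is best kept out of this measurement, so time only the synthesis call after the model is already loaded.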

Known Issues:

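  • Noticeable Chinese accent in English and Portuguese synthesis (Portuguese is not among the supported languages listed under Features)
  • Likely cause: the model was trained primarily on Chinese data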

Next Version:

  • CosyVoice2-0.5B (downloading)
  • Improved English pronunciation
  • Lower latency (150 ms)
  • 30-50% reduction in pronunciation errors