# Alternative OmniAvatar Model Download Guide
## 🎯 Why You're Getting Only Audio Output
Your app is working correctly but running in **TTS-only mode** because the OmniAvatar-14B models are missing. The app gracefully falls back to audio-only generation when video models aren't available.
## 🚀 Solutions to Enable Video Generation
### Option 1: Use Git to Download Models (If you have Git LFS)
```powershell
# Create model directories (Windows path syntax)
mkdir pretrained_models\Wan2.1-T2V-14B
mkdir pretrained_models\OmniAvatar-14B
mkdir pretrained_models\wav2vec2-base-960h
# Enable Git LFS once, then clone each model.
# `git lfs clone` is deprecated; plain `git clone` fetches LFS files once LFS is installed.
git lfs install
git clone https://huggingface.co/Wan-AI/Wan2.1-T2V-14B pretrained_models/Wan2.1-T2V-14B
git clone https://huggingface.co/OmniAvatar/OmniAvatar-14B pretrained_models/OmniAvatar-14B
git clone https://huggingface.co/facebook/wav2vec2-base-960h pretrained_models/wav2vec2-base-960h
```
### Option 2: Install Python and Run Setup Script
1. **Install Python** (if not already done):
   - Download from: https://python.org/downloads/
   - Or install it from the Microsoft Store
   - Make sure to check "Add to PATH" during installation
2. **Run the setup script**:
   ```bash
   python setup_omniavatar.py
   ```
### Option 3: Manual Download from HuggingFace
Visit these URLs and download manually:
- https://huggingface.co/Wan-AI/Wan2.1-T2V-14B
- https://huggingface.co/OmniAvatar/OmniAvatar-14B
- https://huggingface.co/facebook/wav2vec2-base-960h
Place the downloaded files in:
- pretrained_models/Wan2.1-T2V-14B/
- pretrained_models/OmniAvatar-14B/
- pretrained_models/wav2vec2-base-960h/
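
If the `huggingface_hub` package is installed (`pip install huggingface_hub`), the same three repositories could also be fetched from Python rather than through the browser. The sketch below is only an illustration and is not the contents of `setup_omniavatar.py`. The `MODELS` mapping and the `download_all` helper are hypothetical names, while `snapshot_download` is the function provided by `huggingface_hub`.

```python
from pathlib import Path

# Repo IDs and target folders are taken from the lists above.
MODELS = {
    "Wan-AI/Wan2.1-T2V-14B": "pretrained_models/Wan2.1-T2V-14B",
    "OmniAvatar/OmniAvatar-14B": "pretrained_models/OmniAvatar-14B",
    "facebook/wav2vec2-base-960h": "pretrained_models/wav2vec2-base-960h",
}


def download_all(models=MODELS):
    """Download each repository into its target folder (hypothetical helper)."""
    # Imported lazily so this file still loads before huggingface_hub is installed.
    from huggingface_hub import snapshot_download

    for repo_id, target in models.items():
        Path(target).mkdir(parents=True, exist_ok=True)
        print(f"Downloading {repo_id} -> {target}")
        snapshot_download(repo_id=repo_id, local_dir=target)
```

Calling `download_all()` starts the downloads. Nothing is fetched when the file is merely imported.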
### Option 4: Use Windows Subsystem for Linux (WSL)
If you have WSL installed:
```bash
wsl
cd /mnt/c/path/to/your/project
python setup_omniavatar.py
```
## 📊 Model Requirements
Total download size: ~30.36GB
- Wan2.1-T2V-14B: ~28GB (base text-to-video model)
- OmniAvatar-14B: ~2GB (avatar animation weights)
- wav2vec2-base-960h: ~360MB (audio encoder)
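
Before starting a download of this size, it may help to confirm that the target drive has room. The following is a minimal sketch using only the Python standard library. The sizes are the approximate figures listed above, and `MODEL_SIZES_GB`, `total_size_gb`, and `has_enough_space` are hypothetical names chosen for illustration.

```python
import shutil

# Approximate sizes in GB, copied from the list above.
MODEL_SIZES_GB = {
    "Wan2.1-T2V-14B": 28.0,
    "OmniAvatar-14B": 2.0,
    "wav2vec2-base-960h": 0.36,
}


def total_size_gb(sizes=MODEL_SIZES_GB):
    """Sum the approximate download sizes."""
    return sum(sizes.values())


def has_enough_space(path=".", required_gb=None):
    """Return True if the drive holding `path` has at least `required_gb` free."""
    if required_gb is None:
        required_gb = total_size_gb()
    free_gb = shutil.disk_usage(path).free / 1e9  # decimal gigabytes
    return free_gb >= required_gb


print(f"Need about {total_size_gb():.2f} GB; enough space: {has_enough_space()}")
```

Treat the result as a lower bound: a Git LFS clone typically needs more than the raw download size, because it keeps a copy of the LFS objects under `.git` in addition to the checked-out files.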
## 🔍 Verify Installation
After downloading, restart your app and check:
- The app should show "full functionality enabled" in logs
- API responses should return video URLs instead of just audio
- Gradio interface should show video output component
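
As a quick pre-check before restarting the app, the model folders can be inspected from Python. This sketch only tests that each expected folder exists and is not empty. It is an assumption about what "installed" means here and does not reproduce the checks the app itself performs; `REQUIRED_DIRS` and `missing_models` are hypothetical names.

```python
from pathlib import Path

# Folder names taken from the download options above.
REQUIRED_DIRS = [
    "Wan2.1-T2V-14B",
    "OmniAvatar-14B",
    "wav2vec2-base-960h",
]


def missing_models(root="pretrained_models"):
    """Return the required model folders that are absent or empty under `root`."""
    missing = []
    for name in REQUIRED_DIRS:
        folder = Path(root) / name
        if not folder.is_dir() or not any(folder.iterdir()):
            missing.append(name)
    return missing


missing = missing_models()
if missing:
    print("Missing or empty:", ", ".join(missing))
else:
    print("All model folders are present.")
```

Run it from the project root so that the relative `pretrained_models` path resolves correctly.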
## 💡 Current Status
Your setup is working perfectly for TTS! Once the OmniAvatar models are downloaded, you'll get:
- ✅ Audio-driven avatar videos
- ✅ Adaptive body animation
- ✅ Lip-sync accuracy
- ✅ 480p video output