---
title: ParaLip Video Dubbing
emoji: 🎥
colorFrom: blue
colorTo: purple
sdk: gradio
sdk_version: "4.0.0"
app_file: app.py
pinned: false
---
Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference
# ParaLip Video Dubbing
This Hugging Face Space provides video dubbing built on ParaLip ([paper](https://arxiv.org/abs/2107.06831)), a model that generates lip-synchronized video in multiple languages.
[![arXiv](https://img.shields.io/badge/arXiv-Paper-blue.svg)](https://arxiv.org/abs/2107.06831)
[![GitHub Stars](https://img.shields.io/github/stars/Dianezzy/ParaLip?style=social)](https://github.com/Dianezzy/ParaLip)
## Features
- Upload any video file
- Select target language for dubbing
- Generate lip-synchronized dubbed videos
- Support for multiple languages (Spanish, French, German, Italian, Portuguese)
## How to Use
1. Upload a video file using the video upload interface
2. Select your desired target language from the dropdown menu
3. Click the "Dub Video" button
4. Wait for the processing to complete
5. Download the dubbed video
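The steps above can also be scripted against the Space's Gradio API. The sketch below uses the `gradio_client` library; the Space name, endpoint name, and argument order are assumptions, not documented parts of this Space.

```python
def dub_video(video_path, target_language, space="arsh121/paralips"):
    """Call the Space programmatically instead of using the web UI.

    NOTE: the space id, api_name, and argument order here are
    assumptions for illustration; check the Space's "Use via API"
    panel for the real endpoint signature.
    """
    # Deferred import so the function can be defined without the
    # client library installed.
    from gradio_client import Client

    client = Client(space)
    return client.predict(video_path, target_language, api_name="/predict")
```

The returned value is whatever the Space's `app.py` yields for its output component, typically a path to the dubbed video file.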
## Technical Details
The model uses a combination of:
- Video frame processing
- Lip movement prediction
- Language translation
- Audio synthesis
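The four stages above compose into a single pipeline. The sketch below shows only how they chain together; every function here is a hypothetical stand-in, not the actual ParaLip implementation.

```python
def extract_frames(video_path):
    # Stand-in: decode the video into a list of frames.
    return [f"frame_{i}" for i in range(3)]

def translate_text(text, target_lang):
    # Stand-in: machine-translate the transcript.
    return f"[{target_lang}] {text}"

def predict_lip_movements(frames, text):
    # Stand-in: ParaLip-style lip prediction conditioned on the text.
    return [(frame, text) for frame in frames]

def synthesize_audio(text, target_lang):
    # Stand-in: text-to-speech in the target language.
    return f"audio({target_lang})"

def dub(video_path, transcript, target_lang):
    """Chain the four stages: frames -> translation -> lips + audio."""
    frames = extract_frames(video_path)
    translated = translate_text(transcript, target_lang)
    lips = predict_lip_movements(frames, translated)
    audio = synthesize_audio(translated, target_lang)
    return {"frames": lips, "audio": audio}
```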
## Limitations
- Input videos should be clear and well-lit
- Face should be clearly visible in the video
- Processing time depends on video length
- Maximum video length: 5 minutes
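The 5-minute cap can be checked client-side before uploading. A minimal sketch (the helper name is ours, not part of the Space; the duration in seconds could come from e.g. `ffprobe`):

```python
MAX_DURATION_S = 5 * 60  # maximum supported video length: 5 minutes

def within_length_limit(duration_s, max_s=MAX_DURATION_S):
    """Return True if a video of duration_s seconds can be processed."""
    return 0 < duration_s <= max_s
```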
## Model Information
This space uses the ParaLip model, which is trained on the TCD-TIMIT dataset. The model architecture is based on FastSpeech and includes:
- Transformer-based encoder-decoder
- Duration predictor
- Lip movement generator
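In FastSpeech-style architectures, the duration predictor's output drives a length regulator that repeats each encoder state for its predicted number of frames, so the text-level sequence matches the frame-level lip sequence. A minimal pure-Python sketch of that expansion (illustrative only, not the actual ParaLip code):

```python
def length_regulate(encoder_states, durations):
    """Expand each encoder state by its predicted duration (in frames)."""
    if len(encoder_states) != len(durations):
        raise ValueError("one duration per encoder state is required")
    expanded = []
    for state, d in zip(encoder_states, durations):
        expanded.extend([state] * d)  # repeat the state for d frames
    return expanded
```

For example, states `["h1", "h2"]` with durations `[2, 3]` expand to a 5-element frame-level sequence, which the lip movement generator then decodes in parallel.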
## Citation
If you use this model in your research, please cite:
```bibtex
@misc{liu2021parallel,
  doi       = {10.48550/ARXIV.2107.06831},
  url       = {https://arxiv.org/abs/2107.06831},
  author    = {Liu, Jinglin and Zhu, Zhiying and Ren, Yi and Huang, Wencan and Huai, Baoxing and Yuan, Nicholas and Zhao, Zhou},
  title     = {Parallel and High-Fidelity Text-to-Lip Generation},
  publisher = {arXiv},
  year      = {2021},
  copyright = {arXiv.org perpetual, non-exclusive license}
}
```
## License
This project is licensed under the MIT License.
## Acknowledgments
- TCD-TIMIT dataset
- FastSpeech paper and implementation
- Hugging Face Spaces platform
- Original ParaLip implementation by [Dianezzy](https://github.com/Dianezzy/ParaLip)