---
title: ParaLip Video Dubbing
emoji: 🎥
colorFrom: blue
colorTo: purple
sdk: gradio
sdk_version: "4.0.0"
app_file: app.py
pinned: false
---

Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference

# ParaLip Video Dubbing

This Hugging Face Space provides video dubbing with ParaLip, the parallel text-to-lip model introduced in [Parallel and High-Fidelity Text-to-Lip Generation](https://arxiv.org/abs/2107.06831). Given an input video and a target language, it generates a lip-synchronized dubbed version.

[Paper](https://arxiv.org/abs/2107.06831) · [Code](https://github.com/Dianezzy/ParaLip)

## Features

- Upload any video file
- Select target language for dubbing
- Generate lip-synchronized dubbed videos
- Support for multiple languages (Spanish, French, German, Italian, Portuguese)

## How to Use

1. Upload a video file using the video upload interface
2. Select your desired target language from the dropdown menu
3. Click the "Dub Video" button
4. Wait for the processing to complete
5. Download the dubbed video
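
Internally, these steps reduce to a single handler that the UI calls with the uploaded file and the chosen language. Below is a minimal sketch of such a handler; the function name, validation logic, and output naming are illustrative assumptions, not the actual `app.py`:

```python
# Hypothetical handler sketch -- not the real implementation in app.py.
SUPPORTED_LANGUAGES = ["Spanish", "French", "German", "Italian", "Portuguese"]

def dub_video(video_path: str, target_language: str) -> str:
    """Validate the request; the real Space would then run the ParaLip pipeline."""
    if not video_path:
        raise ValueError("No video file provided")
    if target_language not in SUPPORTED_LANGUAGES:
        raise ValueError(f"Unsupported language: {target_language}")
    # ... dubbing pipeline would run here, writing the result next to the input ...
    stem = video_path.rsplit(".", 1)[0]
    return f"{stem}_{target_language.lower()}.mp4"
```

The Gradio UI simply wires the upload component and dropdown to this function and displays the returned file.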

## Technical Details

The model uses a combination of:

- Video frame processing
- Lip movement prediction
- Language translation
- Audio synthesis
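
These components run as a sequential pipeline: transcribe and translate the speech, predict lip movements for the translated text, synthesize audio, and recombine everything with the frames. A toy sketch of that control flow, where every stage is a string-returning stub standing in for a real model:

```python
# Stub stages -- placeholders that show the order of operations only.
def extract_frames(video):        # video frame processing
    return [f"{video}:frame{i}" for i in range(3)]

def transcribe(video):            # speech -> source-language text
    return "hello world"

def translate(text, lang):        # language translation
    return f"[{lang}] {text}"

def predict_lips(text):           # lip movement prediction
    return f"lips({text})"

def synthesize_audio(text):       # audio synthesis
    return f"audio({text})"

def dub(video, target_language):
    frames = extract_frames(video)
    text = translate(transcribe(video), target_language)
    return {"frames": frames,
            "lips": predict_lips(text),
            "audio": synthesize_audio(text)}
```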

## Limitations

- Input videos should be clear and well-lit
- Face should be clearly visible in the video
- Processing time depends on video length
- Maximum video length: 5 minutes
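
The 5-minute cap can be enforced with a cheap pre-check before any model work starts. A sketch (the function name is an assumption; a real Space would read the duration with a tool such as ffprobe):

```python
MAX_DURATION_SECONDS = 5 * 60  # maximum supported video length (5 minutes)

def check_duration(duration_seconds: float) -> None:
    """Reject videos longer than the supported maximum before processing."""
    if duration_seconds > MAX_DURATION_SECONDS:
        raise ValueError(
            f"Video is {duration_seconds:.0f}s long; the maximum is {MAX_DURATION_SECONDS}s."
        )
```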

## Model Information

This Space uses the ParaLip model, which is trained on the TCD-TIMIT dataset. The architecture is based on FastSpeech and includes:

- Transformer-based encoder-decoder
- Duration predictor
- Lip movement generator
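
In FastSpeech-style models, the duration predictor feeds a length regulator: each encoded text unit is repeated for its predicted number of output frames, so the decoder can generate all frames in parallel. A toy illustration of that expansion step (string labels stand in for hidden-state vectors):

```python
def length_regulate(hidden_states, durations):
    """Repeat each encoder state by its predicted duration
    (FastSpeech-style length regulator)."""
    expanded = []
    for state, duration in zip(hidden_states, durations):
        expanded.extend([state] * duration)
    return expanded
```

For example, two phoneme-level states with predicted durations `[2, 3]` expand to five frame-level states.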

## Citation

If you use this model in your research, please cite:

```bibtex
@misc{liu2021parallel,
  doi       = {10.48550/ARXIV.2107.06831},
  url       = {https://arxiv.org/abs/2107.06831},
  author    = {Liu, Jinglin and Zhu, Zhiying and Ren, Yi and Huang, Wencan and Huai, Baoxing and Yuan, Nicholas and Zhao, Zhou},
  title     = {Parallel and High-Fidelity Text-to-Lip Generation},
  publisher = {arXiv},
  year      = {2021},
  copyright = {arXiv.org perpetual, non-exclusive license}
}
```

## License

This project is licensed under the MIT License.

## Acknowledgments

- TCD-TIMIT dataset
- FastSpeech paper and implementation
- Hugging Face Spaces platform
- Original ParaLip implementation by [Dianezzy](https://github.com/Dianezzy/ParaLip)