<p align="center">
  <img src="https://raw.githubusercontent.com/Tencent-Hunyuan/HunyuanPortrait/refs/heads/main/assets/pics/logo.png" height=100>
</p>

<div align="center">
  <h2><font color="red">HunyuanPortrait</font><br>Implicit Condition Control for Enhanced Portrait Animation</h2>

  <a href='https://arxiv.org/abs/2503.18860'><img src='https://img.shields.io/badge/ArXiv-2503.18860-red'></a>
  <a href='https://kkakkkka.github.io/HunyuanPortrait/'><img src='https://img.shields.io/badge/Project-Page-Green'></a>
</div>

## Requirements

* An NVIDIA 3090 GPU with CUDA support is required.
* The model is tested on a single 24 GB GPU.
* Tested operating system: Linux

## Installation

```bash
git clone https://github.com/Tencent-Hunyuan/HunyuanPortrait
cd HunyuanPortrait
pip3 install torch torchvision torchaudio
pip3 install -r requirements.txt
```

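Before downloading the weights, it can help to confirm that the Python environment is complete. The snippet below is our own convenience sketch, not part of the official setup; it checks importability first so it degrades gracefully when a package is missing:

```python
# Environment sanity check (illustrative; not part of the official setup).
import importlib.util


def check_env(required=("torch", "torchvision", "torchaudio")):
    """Return the list of required packages that are not importable."""
    return [name for name in required if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = check_env()
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        import torch  # safe here: torch was found above
        print("CUDA available:", torch.cuda.is_available())
```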
## Download

All models are stored in `pretrained_weights` by default:
```bash
pip3 install "huggingface_hub[cli]"
cd pretrained_weights
huggingface-cli download --resume-download stabilityai/stable-video-diffusion-img2vid-xt --local-dir . --include "*.json"
wget -c https://huggingface.co/LeonJoe13/Sonic/resolve/main/yoloface_v5m.pt
wget -c https://huggingface.co/stabilityai/stable-video-diffusion-img2vid-xt/resolve/main/vae/diffusion_pytorch_model.fp16.safetensors -P vae
wget -c https://huggingface.co/FoivosPar/Arc2Face/resolve/da2f1e9aa3954dad093213acfc9ae75a68da6ffd/arcface.onnx
huggingface-cli download --resume-download tencent/HunyuanPortrait --local-dir hyportrait
```

And the file structure is as follows:
```bash
.
├── arcface.onnx
├── hyportrait
│   ├── dino.pth
│   ├── expression.pth
│   ├── headpose.pth
│   ├── image_proj.pth
│   ├── motion_proj.pth
│   ├── pose_guider.pth
│   └── unet.pth
├── scheduler
│   └── scheduler_config.json
├── unet
│   └── config.json
├── vae
│   ├── config.json
│   └── diffusion_pytorch_model.fp16.safetensors
└── yoloface_v5m.pt
```

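Given the layout above, a small script can verify that every expected file is in place before launching inference. This is a convenience helper we added (the file list simply mirrors the tree above):

```python
# Verify the pretrained_weights layout shown above (illustrative helper).
from pathlib import Path

EXPECTED_FILES = [
    "arcface.onnx",
    "yoloface_v5m.pt",
    "hyportrait/dino.pth",
    "hyportrait/expression.pth",
    "hyportrait/headpose.pth",
    "hyportrait/image_proj.pth",
    "hyportrait/motion_proj.pth",
    "hyportrait/pose_guider.pth",
    "hyportrait/unet.pth",
    "scheduler/scheduler_config.json",
    "unet/config.json",
    "vae/config.json",
    "vae/diffusion_pytorch_model.fp16.safetensors",
]


def missing_weights(root="pretrained_weights"):
    """Return the expected files that are absent under `root`."""
    base = Path(root)
    return [rel for rel in EXPECTED_FILES if not (base / rel).is_file()]


if __name__ == "__main__":
    missing = missing_weights()
    print("All weights present." if not missing else f"Missing: {missing}")
```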
## Run

Bring your portrait to life by executing `bash demo.sh`:

```bash
video_path="your_video.mp4"
image_path="your_image.png"

python inference.py \
    --config config/hunyuan-portrait.yaml \
    --video_path "$video_path" \
    --image_path "$image_path"
```

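The same invocation can also be scripted from Python, for example to batch over several reference images. The wrapper below is a hypothetical sketch of ours (the argument names follow the command above; the image filenames are placeholders):

```python
# Assemble the inference command shown above programmatically (illustrative).
def build_cmd(video_path, image_path, config="config/hunyuan-portrait.yaml"):
    """Return the argv list for inference.py, mirroring demo.sh."""
    return [
        "python", "inference.py",
        "--config", config,
        "--video_path", video_path,
        "--image_path", image_path,
    ]


if __name__ == "__main__":
    for image in ["face_a.png", "face_b.png"]:  # hypothetical reference images
        print(" ".join(build_cmd("your_video.mp4", image)))
        # To actually run: subprocess.run(build_cmd(...), check=True)
```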
## Framework
<img src="https://raw.githubusercontent.com/Tencent-Hunyuan/HunyuanPortrait/refs/heads/main/assets/pics/pipeline.png">

## TL;DR
HunyuanPortrait is a diffusion-based framework for generating lifelike, temporally consistent portrait animations. It decouples identity and motion using pre-trained encoders: expressions and head poses from the driving video are encoded into implicit control signals and injected via attention-based adapters into a stabilized diffusion backbone, enabling detailed, style-flexible animation from a single reference image. The method outperforms existing approaches in controllability and temporal coherence.

# Gallery

Some results of portrait animation using HunyuanPortrait.

More results can be found on our [Project page](https://kkakkkka.github.io/HunyuanPortrait/).

## Cases

<table>
<tr>
<td width="25%">

https://github.com/user-attachments/assets/b234ab88-efd2-44dd-ae12-a160bdeab57e

</td>
<td width="25%">

https://github.com/user-attachments/assets/93631379-f3a1-4f5d-acd4-623a6287c39f

</td>
<td width="25%">

https://github.com/user-attachments/assets/95142e1c-b10f-4b88-9295-12df5090cc54

</td>
<td width="25%">

https://github.com/user-attachments/assets/bea095c7-9668-4cfd-a22d-36bf3689cd8a

</td>
</tr>
</table>

## Portrait Singing

https://github.com/user-attachments/assets/4b963f42-48b2-4190-8d8f-bbbe38f97ac6

## Portrait Acting

https://github.com/user-attachments/assets/48c8c412-7ff9-48e3-ac02-48d4c5a0633a

## Portrait Making Faces

https://github.com/user-attachments/assets/bdd4c1db-ed90-4a24-a3c6-3ea0b436c227

## Acknowledgements

The code is based on [SVD](https://github.com/Stability-AI/generative-models), [DINOv2](https://github.com/facebookresearch/dinov2), [Arc2Face](https://github.com/foivospar/Arc2Face), and [YoloFace](https://github.com/deepcam-cn/yolov5-face). We thank the authors for their open-sourced code and encourage users to cite their works when applicable.

Stable Video Diffusion is licensed under the Stable Video Diffusion Research License, Copyright (c) Stability AI Ltd. All Rights Reserved.

This codebase is intended solely for academic purposes.

# Citation

If you find this project helpful, please feel free to leave a star and cite our paper:
```bibtex
@article{xu2025hunyuanportrait,
  title={HunyuanPortrait: Implicit Condition Control for Enhanced Portrait Animation},
  author={Xu, Zunnan and Yu, Zhentao and Zhou, Zixiang and Zhou, Jun and Jin, Xiaoyu and Hong, Fa-Ting and Ji, Xiaozhong and Zhu, Junwei and Cai, Chengfei and Tang, Shiyu and Lin, Qin and Li, Xiu and Lu, Qinglin},
  journal={arXiv preprint arXiv:2503.18860},
  year={2025}
}
```