<p align="center">
  <img src="https://raw.githubusercontent.com/Tencent-Hunyuan/HunyuanPortrait/refs/heads/main/assets/pics/logo.png" height=100>
</p>

<div align="center">
<h2><font color="red">HunyuanPortrait</font><br>Implicit Condition Control for Enhanced Portrait Animation</h2>

<a href='https://arxiv.org/abs/2503.18860'><img src='https://img.shields.io/badge/ArXiv-2503.18860-red'></a>
<a href='https://kkakkkka.github.io/HunyuanPortrait/'><img src='https://img.shields.io/badge/Project-Page-Green'></a> ![visitors](https://visitor-badge.laobi.icu/badge?page_id=Tencent-Hunyuan.HunyuanPortrait&left_color=green&right_color=red) [![GitHub](https://img.shields.io/github/stars/Tencent-Hunyuan/HunyuanPortrait?style=social)](https://github.com/Tencent-Hunyuan/HunyuanPortrait)
</div>

## 📜 Requirements
* An NVIDIA 3090 GPU with CUDA support is required.
* The model has been tested on a single 24 GB GPU.
* Tested operating system: Linux

## Installation

```bash
git clone https://github.com/Tencent-Hunyuan/HunyuanPortrait
cd HunyuanPortrait
pip3 install torch torchvision torchaudio
pip3 install -r requirements.txt
```

## Download

All models are stored in `pretrained_weights` by default:
```bash
pip3 install "huggingface_hub[cli]"
cd pretrained_weights
huggingface-cli download --resume-download stabilityai/stable-video-diffusion-img2vid-xt --local-dir . --include "*.json"
wget -c https://huggingface.co/LeonJoe13/Sonic/resolve/main/yoloface_v5m.pt
wget -c https://huggingface.co/stabilityai/stable-video-diffusion-img2vid-xt/resolve/main/vae/diffusion_pytorch_model.fp16.safetensors -P vae
wget -c https://huggingface.co/FoivosPar/Arc2Face/resolve/da2f1e9aa3954dad093213acfc9ae75a68da6ffd/arcface.onnx
huggingface-cli download --resume-download tencent/HunyuanPortrait --local-dir hyportrait
```

The resulting file structure is as follows:
```bash
.
├── arcface.onnx
├── hyportrait
│   ├── dino.pth
│   ├── expression.pth
│   ├── headpose.pth
│   ├── image_proj.pth
│   ├── motion_proj.pth
│   ├── pose_guider.pth
│   └── unet.pth
├── scheduler
│   └── scheduler_config.json
├── unet
│   └── config.json
├── vae
│   ├── config.json
│   └── diffusion_pytorch_model.fp16.safetensors
└── yoloface_v5m.pt
```

## Run

🔥 Bring your portrait to life by executing `bash demo.sh`:

```bash
video_path="your_video.mp4"
image_path="your_image.png"

python inference.py \
    --config config/hunyuan-portrait.yaml \
    --video_path "$video_path" \
    --image_path "$image_path"
```
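To animate several image/video pairs, the single-shot command above can be wrapped in a small driver. This is a hedged sketch, not part of the repo: it only assembles `inference.py` command lines following the flags shown above, so you can inspect them or launch each one with `subprocess.run`.

```python
import shlex

def build_commands(pairs, config="config/hunyuan-portrait.yaml"):
    """Build one inference.py command line per (image, video) pair."""
    cmds = []
    for image_path, video_path in pairs:
        cmds.append([
            "python", "inference.py",
            "--config", config,
            "--video_path", video_path,
            "--image_path", image_path,
        ])
    return cmds

# Hypothetical input files, for illustration only.
pairs = [("face_a.png", "drive_1.mp4"), ("face_b.png", "drive_2.mp4")]
for cmd in build_commands(pairs):
    print(shlex.join(cmd))  # or: subprocess.run(cmd, check=True)
```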

## Framework
<img src="https://raw.githubusercontent.com/Tencent-Hunyuan/HunyuanPortrait/refs/heads/main/assets/pics/pipeline.png">

## TL;DR
HunyuanPortrait is a diffusion-based framework for generating lifelike, temporally consistent portrait animations by decoupling identity and motion using pre-trained encoders. It encodes the driving video's expressions and poses into implicit control signals and injects them via attention-based adapters into a stabilized diffusion backbone, enabling detailed, style-flexible animation from a single reference image. The method outperforms existing approaches in controllability and coherence.
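The attention-based injection mentioned above can be illustrated in miniature. The sketch below is a generic cross-attention layer in NumPy, not the actual HunyuanPortrait adapter: backbone features act as queries while implicit motion tokens supply keys and values, so each spatial location attends to the driving signal, and the result is added back residually.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def cross_attention(features, motion_tokens, seed=0):
    """features: (N, d) backbone activations; motion_tokens: (M, d) implicit signals."""
    rng = np.random.default_rng(seed)
    d = features.shape[1]
    # Random projections stand in for the learned Wq / Wk / Wv weights.
    wq, wk, wv = (rng.standard_normal((d, d)) / np.sqrt(d) for _ in range(3))
    q, k, v = features @ wq, motion_tokens @ wk, motion_tokens @ wv
    attn = softmax(q @ k.T / np.sqrt(d))  # (N, M): attention over motion tokens
    return features + attn @ v            # residual injection, as in adapter layers

out = cross_attention(np.ones((16, 8)), np.ones((4, 8)))
print(out.shape)  # (16, 8): same shape as the input features
```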

# 🖼 Gallery

Some results of portrait animation using HunyuanPortrait.

More results can be found on our [project page](https://kkakkkka.github.io/HunyuanPortrait/).

## Cases

<table>
<tr>
<td width="25%">

https://github.com/user-attachments/assets/b234ab88-efd2-44dd-ae12-a160bdeab57e

</td>
<td width="25%">

https://github.com/user-attachments/assets/93631379-f3a1-4f5d-acd4-623a6287c39f

</td>
<td width="25%">

https://github.com/user-attachments/assets/95142e1c-b10f-4b88-9295-12df5090cc54

</td>
<td width="25%">

https://github.com/user-attachments/assets/bea095c7-9668-4cfd-a22d-36bf3689cd8a

</td>
</tr>
</table>

## Portrait Singing

https://github.com/user-attachments/assets/4b963f42-48b2-4190-8d8f-bbbe38f97ac6

## Portrait Acting

https://github.com/user-attachments/assets/48c8c412-7ff9-48e3-ac02-48d4c5a0633a

## Portrait Making Faces

https://github.com/user-attachments/assets/bdd4c1db-ed90-4a24-a3c6-3ea0b436c227

## Acknowledgements

The code is based on [SVD](https://github.com/Stability-AI/generative-models), [DINOv2](https://github.com/facebookresearch/dinov2), [Arc2Face](https://github.com/foivospar/Arc2Face), and [YoloFace](https://github.com/deepcam-cn/yolov5-face). We thank the authors for their open-sourced code and encourage users to cite their works when applicable.
Stable Video Diffusion is licensed under the Stable Video Diffusion Research License, Copyright (c) Stability AI Ltd. All Rights Reserved.
This codebase is intended solely for academic purposes.

# 🎼 Citation

If you find this project helpful, please feel free to leave a star ⭐ and cite our paper:
```bibtex
@article{xu2025hunyuanportrait,
  title={HunyuanPortrait: Implicit Condition Control for Enhanced Portrait Animation},
  author={Xu, Zunnan and Yu, Zhentao and Zhou, Zixiang and Zhou, Jun and Jin, Xiaoyu and Hong, Fa-Ting and Ji, Xiaozhong and Zhu, Junwei and Cai, Chengfei and Tang, Shiyu and Lin, Qin and Li, Xiu and Lu, Qinglin},
  journal={arXiv preprint arXiv:2503.18860},
  year={2025}
}
```