James Zhou committed on
Commit f58903b · 1 Parent(s): 2e936cf

[update] readme

Files changed (1)
  1. README.md +4 -4
README.md CHANGED
@@ -18,7 +18,7 @@ extra_gated_eu_disallowed: true
 
  <h1>🎬 HunyuanVideo-Foley </h1>
 
- <h3>Multimodal Diffusion with Representation Alignment for High-Fidelity Foley Audio Generation</h3>
+ <h4>Multimodal Diffusion with Representation Alignment for High-Fidelity Foley Audio Generation</h4>
 
  <p align="center">
  <strong>Professional-grade AI sound effect generation for video content creators</strong>
@@ -107,7 +107,7 @@ Professional-grade audio generation with crystal clarity
 
  <div align="center" style="background: linear-gradient(135deg, #ffeef8 0%, #f0f8ff 100%); padding: 30px; border-radius: 20px; margin: 20px 0; border-left: 5px solid #ff6b9d; color: #333;">
 
- **🚀 Tencent Hunyuan** proudly open-sources **HunyuanVideo-Foley** - an end-to-end video sound effect generation model!
+ **🚀 Tencent Hunyuan** open-sources **HunyuanVideo-Foley** an end-to-end video sound effect generation model!
 
  *A professional-grade AI tool specifically designed for video content creators, widely applicable to diverse scenarios including short video creation, film production, advertising creativity, and game development.*
 
@@ -217,7 +217,7 @@ The **TV2A (Text-Video-to-Audio)** task presents a complex multimodal generation
  | Frieren | 5.71 | 2.81 | 3.47 | 5.31 | 0.18 | 1.39 | 0.16 | 2.92±0.95 | 2.76±1.20 | 2.94±1.26 |
  | MMAudio | 6.17 | 2.84 | 3.59 | 5.62 | 0.27 | 0.80 | 0.35 | 3.58±0.84 | 3.63±1.00 | 3.47±1.03 |
  | ThinkSound | 6.04 | 3.73 | 3.81 | 5.59 | 0.18 | 0.91 | 0.20 | 3.20±0.97 | 3.01±1.04 | 3.02±1.08 |
- | **🥇 HiFi-Foley (ours)** | **🟢 6.59** | **🟢 2.74** | **🟢 3.88** | **🟢 6.13** | **🟢 0.35** | **🟢 0.74** | **🟢 0.33** | **🟢 4.14±0.68** | **🟢 4.12±0.77** | **🟢 4.15±0.75** |
+ | **HunyuanVideo-Foley (ours)** | **6.59** | **2.74** | **3.88** | **6.13** | **0.35** | **0.74** | **0.33** | **4.14±0.68** | **4.12±0.77** | **4.15±0.75** |
 
  </div>
 
@@ -239,7 +239,7 @@ The **TV2A (Text-Video-to-Audio)** task presents a complex multimodal generation
  | Frieren | 16.86 | 293.57 | 2.95 | 7.32 | 5.72 | 2.55 | 2.88 | 5.10 | 0.21 | 0.86 | 0.16 |
  | MMAudio | 9.01 | 205.85 | 2.17 | 9.59 | 5.94 | 2.91 | 3.30 | 5.39 | 0.30 | 0.56 | 0.27 |
  | ThinkSound | 9.92 | 228.68 | 2.39 | 6.86 | 5.78 | 3.23 | 3.12 | 5.11 | 0.22 | 0.67 | 0.22 |
- | **🥇 HiFi-Foley (ours)** | **🟢 6.07** | **🟢 202.12** | **🟢 1.89** | **🟢 8.30** | **🟢 6.12** | **🟢 2.76** | **🟢 3.22** | **🟢 5.53** | **🟢 0.38** | **🟢 0.54** | **🟢 0.24** |
+ | **HunyuanVideo-Foley (ours)** | **6.07** | **202.12** | **1.89** | **8.30** | **6.12** | **2.76** | **3.22** | **5.53** | **0.38** | **0.54** | **0.24** |
 
  </div>
 