UltraVideo: High-Quality UHD 4K Video Dataset

🤓 Project | 📑 Paper | 🤗 Hugging Face (UltraVideo Dataset) | 🤗 Hugging Face (UltraVideo-Long Dataset) | 🤗 Hugging Face (UltraWan-1K/4K Weights)

UltraVideo: High-Quality UHD Video Dataset with Comprehensive Captions

🎋 Click the image below to watch the 4K demo video.
🤓 The first open-source UHD 4K/8K video dataset with comprehensive structured captions (10 types).
🤓 Native 1K/4K video generation with UltraWan.

TODO

Release UltraVideo-Short
Release UltraVideo-Long for long video generation and understanding.
Release structured captions generated by our pipeline (PPL) for Open-Sora-Plan.

Quickstart

Refer to DiffSynth-Studio/examples/wanvideo for environment preparation.

pip install diffsynth==1.1.7

Download the Wan2.1-T2V-1.3B model using huggingface-cli:

pip install "huggingface_hub[cli]"
huggingface-cli download --repo-type model Wan-AI/Wan2.1-T2V-1.3B --local-dir ultrawan_weights/Wan2.1-T2V-1.3B --resume-download

Download the UltraWan-1K/4K models using huggingface-cli:

huggingface-cli download --repo-type model APRIL-AIGC/UltraWan --local-dir ultrawan_weights/UltraWan --resume-download
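
The two model downloads above can also be scripted from Python. The following is a minimal sketch, assuming the `huggingface_hub` package installed above; `snapshot_download` is that package's function for fetching a whole repository, while the `DOWNLOADS` table and the `fetch_all` helper are names made up for this illustration.

```python
from pathlib import Path

# (repo_id, local_dir) pairs copied from the huggingface-cli commands above.
DOWNLOADS = [
    ("Wan-AI/Wan2.1-T2V-1.3B", "ultrawan_weights/Wan2.1-T2V-1.3B"),
    ("APRIL-AIGC/UltraWan", "ultrawan_weights/UltraWan"),
]


def fetch_all(downloads=DOWNLOADS, dry_run=False):
    """Download each model repo into its local directory and return the paths.

    With dry_run=True nothing is downloaded; only the target paths are returned.
    """
    paths = []
    for repo_id, local_dir in downloads:
        if not dry_run:
            # Imported lazily so a dry run works without the package installed.
            from huggingface_hub import snapshot_download

            snapshot_download(repo_id=repo_id, repo_type="model", local_dir=local_dir)
        paths.append(Path(local_dir))
    return paths
```

Calling `fetch_all()` performs both downloads; `fetch_all(dry_run=True)` only lists the target directories, which is handy for checking paths before starting a large transfer.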

Generate native 1K/4K videos.

==> one GPU
LoRA_1k: CUDA_VISIBLE_DEVICES=0 python infer.py --model_dir ultrawan_weights/Wan2.1-T2V-1.3B --model_path ultrawan_weights/UltraWan/ultrawan-1k.ckpt --mode lora --lora_alpha 0.25 --usp 0 --height 1088 --width 1920 --num_frames 81 --out_dir output/ultrawan-1k
LoRA_4k: CUDA_VISIBLE_DEVICES=0 python infer.py --model_dir ultrawan_weights/Wan2.1-T2V-1.3B --model_path ultrawan_weights/UltraWan/ultrawan-4k.ckpt --mode lora --lora_alpha 0.5 --usp 0 --height 2160 --width 3840 --num_frames 33 --out_dir output/ultrawan-4k

==> usp with 6 GPUs
LoRA_1k: CUDA_VISIBLE_DEVICES=0,1,2,3,4,5 torchrun --standalone --nproc_per_node=6 infer.py --model_dir ultrawan_weights/Wan2.1-T2V-1.3B --model_path ultrawan_weights/UltraWan/ultrawan-1k.ckpt --mode lora --lora_alpha 0.25 --usp 1 --height 1088 --width 1920 --num_frames 81 --out_dir output/ultrawan-1k
LoRA_4k: CUDA_VISIBLE_DEVICES=0,1,2,3,4,5 torchrun --standalone --nproc_per_node=6 infer.py --model_dir ultrawan_weights/Wan2.1-T2V-1.3B --model_path ultrawan_weights/UltraWan/ultrawan-4k.ckpt --mode lora --lora_alpha 0.5 --usp 1 --height 2160 --width 3840 --num_frames 33 --out_dir output/ultrawan-4k

Official Inference

==> one GPU
ori_1k: CUDA_VISIBLE_DEVICES=0 python infer.py --model_dir ultrawan_weights/Wan2.1-T2V-1.3B --mode full --usp 0 --height 1088 --width 1920 --num_frames 81 --out_dir output/ori-1k

==> usp with 6 GPUs
ori_1k: CUDA_VISIBLE_DEVICES=0,1,2,3,4,5 torchrun --standalone --nproc_per_node=6 infer.py --model_dir ultrawan_weights/Wan2.1-T2V-1.3B --mode full --usp 1 --height 1088 --width 1920 --num_frames 81 --out_dir output/ori-1k
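
The inference commands above differ only in a handful of settings, so they can be generated from a small table. This is a sketch rather than part of the repository: the preset names and the helpers `build_command` and `shell_line` are hypothetical, every flag and value is copied from the commands above, and it assumes `--usp 1` is wanted whenever more than one GPU is used.

```python
import shlex

BASE_MODEL = "ultrawan_weights/Wan2.1-T2V-1.3B"

# Settings copied from the commands above; the preset names are illustrative.
PRESETS = {
    "lora_1k": dict(mode="lora", ckpt="ultrawan_weights/UltraWan/ultrawan-1k.ckpt",
                    lora_alpha=0.25, height=1088, width=1920, num_frames=81,
                    out_dir="output/ultrawan-1k"),
    "lora_4k": dict(mode="lora", ckpt="ultrawan_weights/UltraWan/ultrawan-4k.ckpt",
                    lora_alpha=0.5, height=2160, width=3840, num_frames=33,
                    out_dir="output/ultrawan-4k"),
    "ori_1k": dict(mode="full", ckpt=None, lora_alpha=None,
                   height=1088, width=1920, num_frames=81,
                   out_dir="output/ori-1k"),
}


def build_command(preset, num_gpus=1):
    """Return the argv list for one preset on `num_gpus` GPUs."""
    p = PRESETS[preset]
    if num_gpus > 1:
        launcher = ["torchrun", "--standalone", f"--nproc_per_node={num_gpus}"]
    else:
        launcher = ["python"]
    args = ["infer.py", "--model_dir", BASE_MODEL]
    if p["ckpt"] is not None:
        args += ["--model_path", p["ckpt"]]
    args += ["--mode", p["mode"]]
    if p["lora_alpha"] is not None:
        args += ["--lora_alpha", str(p["lora_alpha"])]
    # Assumption: sequence parallelism (--usp 1) is enabled only for multi-GPU runs.
    args += ["--usp", "1" if num_gpus > 1 else "0",
             "--height", str(p["height"]), "--width", str(p["width"]),
             "--num_frames", str(p["num_frames"]), "--out_dir", p["out_dir"]]
    return launcher + args


def shell_line(preset, num_gpus=1):
    """Return the full shell line, including the CUDA_VISIBLE_DEVICES prefix."""
    devices = ",".join(str(i) for i in range(num_gpus))
    return f"CUDA_VISIBLE_DEVICES={devices} " + shlex.join(build_command(preset, num_gpus))
```

For example, `print(shell_line("lora_4k", num_gpus=6))` prints a six-GPU 4K command, and `subprocess.run(build_command("ori_1k"))` would launch the single-GPU baseline directly.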

UltraVideo Dataset

Download the UltraVideo dataset.

huggingface-cli download --repo-type dataset APRIL-AIGC/UltraVideo --local-dir ./UltraVideo --resume-download

Users must follow LICENSE_APRIL_LAB to use this dataset.
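
Once the download finishes, a quick inventory of the local directory helps confirm what arrived. The sketch below uses only the Python standard library and assumes nothing about the dataset layout: it counts files and bytes per file extension. The function name `summarize_dir` is illustrative.

```python
from collections import Counter
from pathlib import Path


def summarize_dir(root):
    """Return {extension: (file_count, total_bytes)} for all files under root."""
    counts, sizes = Counter(), Counter()
    for path in Path(root).rglob("*"):
        if path.is_file():
            # Normalize case so ".MP4" and ".mp4" are counted together.
            ext = path.suffix.lower() or "(none)"
            counts[ext] += 1
            sizes[ext] += path.stat().st_size
    return {ext: (counts[ext], sizes[ext]) for ext in counts}
```

Running `summarize_dir("./UltraVideo")` after the download gives a per-extension overview that can be compared against the file listing on the dataset page.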

VBench-Style Prompts of UltraVideo

The VBench-style prompts used for UltraVideo in the paper are provided for reference: assets/ultravideo_prompts_in_VBench_style.json
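
The schema of the prompt file is not described here, so the sketch below loads it and reports its top-level shape without assuming any field names. `describe_json` is a hypothetical helper written for this illustration; the path is the one given above, relative to the repository root.

```python
import json
from pathlib import Path


def describe_json(path):
    """Load a JSON file and return (data, short description of its top level)."""
    data = json.loads(Path(path).read_text(encoding="utf-8"))
    if isinstance(data, dict):
        summary = f"dict with {len(data)} keys, e.g. {list(data)[:5]}"
    elif isinstance(data, list):
        first = type(data[0]).__name__ if data else "n/a"
        summary = f"list with {len(data)} items, first item type: {first}"
    else:
        summary = f"scalar of type {type(data).__name__}"
    return data, summary
```

A call such as `data, summary = describe_json("assets/ultravideo_prompts_in_VBench_style.json")` followed by `print(summary)` shows how the prompts are organized before any code is written against them.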

License Agreement

Users must follow LICENSE_APRIL_LAB to use the UltraVideo dataset.
Users must follow Wan-Video/Wan2.1/LICENSE.txt to use Wan-related models.

Acknowledgements

We would like to thank the contributors to the Wan2.1, Qwen, umt5-xxl, diffusers, and HuggingFace repositories for their open research.

Citation

If you find our work helpful, please cite us.

@article{ultravideo,
  title={UltraVideo: High-Quality UHD Video Dataset with Comprehensive Captions},
  author={Xue, Zhucun and Zhang, Jiangning and Hu, Teng and He, Haoyang and Chen, Yinan and Cai, Yuxuan and Wang, Yabiao and Wang, Chengjie and Liu, Yong and Li, Xiangtai and Tao, Dacheng},
  journal={arXiv preprint arXiv:2506.13691},
  year={2025}
}


Paper for APRIL-AIGC/UltraWan

UltraVideo: High-Quality UHD Video Dataset with Comprehensive Captions

Paper • 2506.13691 • Published Jun 16, 2025 • 2