Tora2: Motion and Appearance Customized Diffusion Transformer for Multi-Entity Video Generation Paper • 2507.05963 • Published Jul 8 • 12 • 2
PolyVivid: Vivid Multi-Subject Video Generation with Cross-Modal Interaction and Enhancement Paper • 2506.07848 • Published Jun 9 • 4 • 2
UniWorld: High-Resolution Semantic Encoders for Unified Visual Understanding and Generation Paper • 2506.03147 • Published Jun 3 • 58 • 2
MAGREF: Masked Guidance for Any-Reference Video Generation Paper • 2505.23742 • Published May 29 • 9 • 2
OpenS2V-Nexus: A Detailed Benchmark and Million-Scale Dataset for Subject-to-Video Generation Paper • 2505.20292 • Published May 26 • 54 • 3
ImgEdit: A Unified Image Editing Dataset and Benchmark Paper • 2505.20275 • Published May 26 • 18 • 3
HunyuanCustom: A Multimodal-Driven Architecture for Customized Video Generation Paper • 2505.04512 • Published May 7 • 36 • 3
Packing Input Frame Context in Next-Frame Prediction Models for Video Generation Paper • 2504.12626 • Published Apr 17 • 52 • 3