VGGT (CVPR 2025)
Generate stable LEGO structures from text prompts.
A Step Towards Music Generation Foundation Model
Generate static files for spaces
One framework to generate multi-view images without LoRA
View and compare pass@k metrics for AI models
A Unified Framework for Image Customization
Generate an edited image based on text instructions
Generate realistic talking video from an image and audio
Why you need it, how to get it
Scalable and Versatile 3D Generation from images
Universal Image Editing is worth a single LoRA
Plug-and-play with visual concepts
Edit an image based on the given instruction
Generate audio and video from text prompts