Abstract
Recent advances in continuous generative models, including multi-step approaches like diffusion and flow-matching (typically requiring 8-1000 sampling steps) and few-step methods such as consistency models (typically 1-8 steps), have demonstrated impressive generative performance. However, existing work often treats these approaches as distinct paradigms, resulting in separate training and sampling methodologies. We introduce a unified framework for training, sampling, and analyzing these models. Our implementation, the Unified Continuous Generative Models Trainer and Sampler (UCGM-{T,S}), achieves state-of-the-art (SOTA) performance. For example, on ImageNet 256x256 using a 675M diffusion transformer, UCGM-T trains a multi-step model achieving 1.30 FID in 20 steps and a few-step model reaching 1.42 FID in just 2 steps. Additionally, applying UCGM-S to a pre-trained model (previously 1.26 FID at 250 steps) improves performance to 1.06 FID in only 40 steps. Code is available at: https://github.com/LINs-lab/UCGM.
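To make the unification concrete, below is a minimal conceptual sketch of how a single sampler can span both regimes. It is a hypothetical illustration, not the UCGM-S algorithm: it assumes a linear flow-matching interpolation x_t = (1 - t) * x_0 + t * eps and a model that predicts the clean sample, so that a large step count behaves like a multi-step diffusion/flow solver while 1-2 steps degenerate to consistency-style sampling. The function name `unified_sample` and the model interface are assumptions for illustration.

```python
import torch

def unified_sample(model, x_T, num_steps):
    """Conceptual unified sampler (hypothetical; not the UCGM-S algorithm).

    Assumes `model(x_t, t)` predicts the clean sample x_0 under the linear
    interpolation x_t = (1 - t) * x_0 + t * eps used in flow matching.
    With num_steps >> 1 this acts as a multi-step solver; with
    num_steps in {1, 2} it reduces to consistency-style sampling.
    """
    x = x_T
    # Time grid from t = 1 (pure noise) down to t = 0 (data).
    ts = torch.linspace(1.0, 0.0, num_steps + 1)
    for i in range(num_steps):
        t, t_next = ts[i], ts[i + 1]
        x0_pred = model(x, t)                    # predicted clean sample
        eps_pred = (x - (1 - t) * x0_pred) / t   # implied noise estimate (t > 0 here)
        # Deterministically jump along the interpolation path to t_next;
        # at the final step (t_next = 0) this returns x0_pred exactly.
        x = (1 - t_next) * x0_pred + t_next * eps_pred
    return x
```

Under this view, the multi-step and few-step settings differ only in the choice of `num_steps`, which is the kind of shared training/sampling interface the abstract describes.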
Community
We have introduced a unified framework (UCGM) for training, sampling, and analyzing both multi-step models like diffusion and flow-matching, as well as few-step methods such as consistency models.
Notably, we achieve state-of-the-art (SOTA) performance on ImageNet 256x256 (1.06 FID with 40 sampling steps, 1.42 FID with 2 sampling steps) and ImageNet 512x512 (1.24 FID with 150 sampling steps, 1.75 FID with 2 sampling steps)!