This repository contains T2V-Turbo and its reference model for DPO (Direct Preference Optimization), based on our UnifiedReward-7B.
For further details, please refer to the paper (arXiv:2503.05236). For detailed usage, see the GitHub repository.
@article{unifiedreward,
title={Unified reward model for multimodal understanding and generation},
author={Wang, Yibin and Zang, Yuhang and Li, Hao and Jin, Cheng and Wang, Jiaqi},
journal={arXiv preprint arXiv:2503.05236},
year={2025}
}