CodeGoat24/UnifiedReward-qwen-3b


CodeGoat24 committed on May 19

Commit 23875a5 · verified · 1 Parent(s): 7a0d339

Update README.md

Files changed (1)

README.md +1 -1


@@ -19,7 +19,7 @@ We are actively gathering feedback from the community to improve our models. **W
 ## Model Summary
-`UnifiedReward-qwen-3b` is the first unified reward model based on [Qwen/Qwen2.5-VL-7B-Instruct](https://huggingface.co/Qwen/Qwen2.5-VL-7B-Instruct) for multimodal understanding and generation assessment, enabling both pairwise ranking and pointwise scoring, which can be employed for vision model preference alignment.
+`UnifiedReward-qwen-3b` is the first unified reward model based on [Qwen/Qwen2.5-VL-3B-Instruct](https://huggingface.co/Qwen/Qwen2.5-VL-3B-Instruct) for multimodal understanding and generation assessment, enabling both pairwise ranking and pointwise scoring, which can be employed for vision model preference alignment.
 For further details, please refer to the following resources:
 - 📰 Paper: https://arxiv.org/pdf/2503.05236