Update README.md
README.md
@@ -19,7 +19,7 @@ We are actively gathering feedback from the community to improve our models. **W
 
 ## Model Summary
 
-`UnifiedReward-qwen-3b` is the first unified reward model based on [Qwen/Qwen2.5-VL-
+`UnifiedReward-qwen-3b` is the first unified reward model based on [Qwen/Qwen2.5-VL-3B-Instruct](https://huggingface.co/Qwen/Qwen2.5-VL-3B-Instruct) for multimodal understanding and generation assessment, enabling both pairwise ranking and pointwise scoring, which can be employed for vision model preference alignment.
 
 For further details, please refer to the following resources:
 - 📰 Paper: https://arxiv.org/pdf/2503.05236