rubricreward/mR3-Qwen3-14B-tgt-prompt-tgt-thinking-translated • Text Generation • 15B • Updated • 10
rubricreward/mR3-Qwen3-8B-tgt-prompt-tgt-thinking-translated • Text Generation • 8B • Updated • 12
rubricreward/mR3-Qwen3-4B-tgt-prompt-tgt-thinking-translated • Text Generation • 4B • Updated • 10
rubricreward/mR3-Qwen3-14B-tgt-prompt-tgt-thinking • Text Generation • 15B • Updated • 6
rubricreward/mR3-Qwen3-8B-tgt-prompt-tgt-thinking • Text Generation • 8B • Updated • 8
rubricreward/mR3-Qwen3-4B-tgt-prompt-tgt-thinking • Text Generation • 4B • Updated • 11
rubricreward/mR3-Qwen3-4B-tgt-prompt-en-thinking • Text Generation • 4B • Updated • 11
rubricreward/mR3-Qwen3-8B-tgt-prompt-en-thinking • Text Generation • 8B • Updated • 8
rubricreward/mR3-Qwen3-14B-tgt-prompt-en-thinking • Text Generation • 15B • Updated • 16
rubricreward/mR3-Qwen3-14B-en-prompt-en-thinking • Text Generation • 15B • Updated • 14
rubricreward/mR3-Qwen3-4B-en-prompt-en-thinking • Text Generation • 4B • Updated • 22
rubricreward/mR3-Qwen3-8B-en-prompt-en-thinking • Text Generation • 8B • Updated • 10
rubricreward/mR3-gpt-oss-20b-en-prompt-en-thinking • 335k • Updated • 3 • 1
rubricreward/LLaMA-3.2-3B-DPO-HelpSteer3-R3-Qwen3-14B-LoRA-4k • Text Generation • Updated • 5
rubricreward/LLaMA-3.2-3B-DPO-HelpSteer3-R3-Qwen3-8B-14k • Text Generation • Updated • 5
rubricreward/LLaMA-3.2-3B-DPO-HelpSteer3-R3-Qwen3-4B-14k • Text Generation • Updated • 5
rubricreward/R3-DeepSeek-R1-Distill-Qwen-14B-LoRA-4k • 15B • Updated • 5
rubricreward/R3-DeepSeek-R1-Distill-Qwen-14B-LoRA-14k • 15B • Updated • 5
rubricreward/R3-DeepSeek-R1-Distill-Qwen-14B-14k • Text Generation • 15B • Updated • 7
rubricreward/R3-DeepSeek-R1-Distill-Qwen-14B-4k • Text Generation • 15B • Updated • 5
rubricreward/R3-Phi-4-reasoning-plus-LoRA-14k • 15B • Updated • 5
rubricreward/R3-Qwen3-14B-LoRA-14k • 15B • Updated • 4
rubricreward/R3-Qwen3-8B-LoRA-14k • Text Generation • 8B • Updated • 13 • 2
rubricreward/R3-Qwen3-4B-LoRA-14k • 4B • Updated • 6
rubricreward/R3-Qwen2.5-7B-LoRA-4k • 8B • Updated • 6
rubricreward/R3-Qwen2.5-7B-LoRA-14k • 8B • Updated • 5
rubricreward/R3-Qwen2.5-7B-14k • 8B • Updated • 5
rubricreward/R3-Qwen2.5-7B-4k • 8B • Updated • 7 • 1
rubricreward/R3-Qwen3-14B-LoRA-Random-Filter1 • Updated
rubricreward/R3-Qwen3-14B-LoRA-Preference-Only-v1.1 • 15B • Updated • 7 • 1