Commit cde4bdc (verified), committed by zhoujun
1 Parent(s): 8fd1320

Update README.md

Files changed (1)
  1. README.md +3 -3
README.md CHANGED
@@ -6,9 +6,9 @@ license: cc-by-nc-4.0
 
 This repository contains the Guru-32B (base Qwen2.5-32B) model presented in [Revisiting Reinforcement Learning for LLM Reasoning from A Cross-Domain Perspective](https://huggingface.co/papers/2506.14965).
 
-The score we evaluate with temperature=1.0, top_p=0.7.
+The leaderboard is evaluated with our evaluation [code](https://github.com/LLM360/Reasoning360/tree/main/scripts/offline_eval). The temperature=1.0, top_p=0.7.
 
-![Leaderboard](./figures/leaderboard.png)
+![Leaderboard](.leaderboard.png)
 
 
-Please refer to the paper for more details.
+Please refer to the [paper](https://arxiv.org/abs/2506.14965) for more details.
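
The changed README line quotes the sampling settings temperature=1.0 and top_p=0.7. As a rough illustration of what those two parameters control, and not a reproduction of the linked evaluation scripts, the sketch below applies temperature scaling and nucleus (top-p) filtering to a toy list of logits in plain Python. The function name `top_p_filter` and the example logits are hypothetical. Real inference libraries may treat the token that crosses the threshold slightly differently.

```python
import math


def top_p_filter(logits, temperature=1.0, top_p=0.7):
    """Return {token_index: renormalized probability} for the nucleus.

    Hypothetical helper for illustration; defaults mirror the settings
    quoted in the README (temperature=1.0, top_p=0.7).
    """
    # Temperature-scaled softmax (subtract the max for numerical stability).
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    probs = [e / total for e in exps]

    # Keep the most probable tokens until their cumulative mass reaches top_p.
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, mass = [], 0.0
    for i in order:
        kept.append(i)
        mass += probs[i]
        if mass >= top_p:
            break

    # Renormalize so the kept probabilities sum to 1 before sampling.
    return {i: probs[i] / mass for i in kept}


# Toy logits (assumed values, not model output).
nucleus = top_p_filter([2.0, 1.0, 0.5, -1.0])
print(sorted(nucleus))
```

With temperature=1.0 the logits are left unscaled, so only the top_p cutoff changes the distribution: low-probability tokens outside the nucleus are never sampled.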