flyingbugs/granite3.3-8b-reinforce_plus-math_different_reward_global_step60_hf Text Generation • 8B • Updated 10 days ago • 16
flyingbugs/granite3.3-8b-math-pku-rlhf-reinforce-plus Text Generation • 8B • Updated 17 days ago • 10
flyingbugs/Qwen2.5-Math-7B-OpenR1-Math-220k-pruned-keep-0.5-end-start-0.5-add-aime Text Generation • 8B • Updated May 13 • 5
flyingbugs/Qwen2.5-Math-7B-OpenR1-Math-220k-random-perturbation-head Text Generation • 8B • Updated May 10 • 4
flyingbugs/Qwen2.5-Math-7B-OpenR1-Math-220k-random-perturbation-full Text Generation • 8B • Updated May 10 • 5
flyingbugs/Qwen2.5-Math-7B-OpenR1-Math-220k-random-perturbation-tail Text Generation • 8B • Updated May 10 • 5
flyingbugs/Qwen2.5-Math-7B-OpenR1-Math-220k-keep-0.5-end-start-0.5-random-perturbation Text Generation • 8B • Updated May 10 • 5
flyingbugs/Qwen2.5-Math-7B-OpenR1-Math-220k-pruned-keep-0.75-end-start-0.0 Text Generation • 8B • Updated May 10 • 4
flyingbugs/Qwen2.5-Math-7B-OpenR1-Math-220k-random-perturbation-middle Text Generation • 8B • Updated May 10 • 7
flyingbugs/Qwen2.5-Math-7B-OpenR1-Math-220k-pruned-think-mid Text Generation • 8B • Updated May 10 • 5
flyingbugs/Qwen2.5-Math-7B-OpenR1-Math-220k-pruned-keep-0.75-end-start-0.5 Text Generation • 8B • Updated May 8 • 3
flyingbugs/Qwen2.5-Math-7B-Instruct-OpenR1-Math-220k-pruned-keep-0.75-end-start-1.0 Text Generation • 8B • Updated May 8 • 2
flyingbugs/Qwen2.5-Math-7B-OpenR1-Math-220k-pruned-keep-0.01-end-start-1.0 Text Generation • 8B • Updated May 7 • 3
flyingbugs/Qwen2.5-Math-7B-OpenR1-Math-220k-pruned-keep-0.01-end-start-0.0 Text Generation • 8B • Updated May 7 • 3
flyingbugs/Qwen2.5-Math-7B-OpenR1-Math-220k-pruned-keep-0.1-end-start-1.0 Text Generation • 8B • Updated May 7 • 6
flyingbugs/Qwen2.5-Math-7B-OpenR1-Math-220k-pruned-keep-0.1-end-start-0.0 Text Generation • 8B • Updated May 7 • 5