INTELLECT-2: A Reasoning Model Trained Through Globally Decentralized Reinforcement Learning Paper • 2505.07291 • Published 1 day ago • 2
Kimina-Prover Preview: Towards Large Formal Reasoning Models with Reinforcement Learning Paper • 2504.11354 • Published 28 days ago • 4
view article Article From DeepSpeed to FSDP and Back Again with Hugging Face Accelerate Jun 13, 2024 • 54
Open-Reasoner-Zero: An Open Source Approach to Scaling Up Reinforcement Learning on the Base Model Paper • 2503.24290 • Published Mar 31 • 62
Understanding R1-Zero-Like Training: A Critical Perspective Paper • 2503.20783 • Published Mar 26 • 48
SimpleRL-Zoo: Investigating and Taming Zero Reinforcement Learning for Open Base Models in the Wild Paper • 2503.18892 • Published Mar 24 • 30
Reinforcement Learning for Reasoning in Small LLMs: What Works and What Doesn't Paper • 2503.16219 • Published Mar 20 • 48
Optimizing Test-Time Compute via Meta Reinforcement Fine-Tuning Paper • 2503.07572 • Published Mar 10 • 44