Reflect, Retry, Reward: Self-Improving LLMs via Reinforcement Learning Paper • 2505.24726 • Published May 30 • 261
ProRL: Prolonged Reinforcement Learning Expands Reasoning Boundaries in Large Language Models Paper • 2505.24864 • Published May 30 • 133
Beyond the 80/20 Rule: High-Entropy Minority Tokens Drive Effective Reinforcement Learning for LLM Reasoning Paper • 2506.01939 • Published Jun 2 • 169
A Survey of Quantization Methods for Efficient Neural Network Inference Paper • 2103.13630 • Published Mar 25, 2021 • 1
The Ultra-Scale Playbook 🌌 The ultimate guide to training LLMs on large GPU clusters • Running • 2.81k