CUDA-L1: Improving CUDA Optimization via Contrastive Reinforcement Learning Paper • 2507.14111 • Published 27 days ago • 22
The Entropy Mechanism of Reinforcement Learning for Reasoning Language Models Paper • 2505.22617 • Published May 28 • 128
R2R: Efficiently Navigating Divergent Reasoning Paths with Small-Large Model Token Routing Paper • 2505.21600 • Published May 27 • 71
Vision Language Models (Better, Faster, Stronger) Article • By merve and 4 others • May 12 • 503