Sailing AI by the Stars: A Survey of Learning from Rewards in Post-Training and Test-Time Scaling of Large Language Models Paper • 2505.02686 • Published 9 days ago • 12
Article LeRobot Community Datasets: The “ImageNet” of Robotics — When and How? 3 days ago • 44
On Path to Multimodal Generalist: General-Level and General-Bench Paper • 2505.04620 • Published 6 days ago • 71
Flow-GRPO: Training Flow Matching Models via Online RL Paper • 2505.05470 • Published 5 days ago • 65
Perception, Reason, Think, and Plan: A Survey on Large Multimodal Reasoning Models Paper • 2505.04921 • Published 6 days ago • 129
X-Reasoner: Towards Generalizable Reasoning Across Modalities and Domains Paper • 2505.03981 • Published 7 days ago • 14
DAPO: An Open-Source LLM Reinforcement Learning System at Scale Paper • 2503.14476 • Published Mar 18 • 126