Teaching language models to think efficiently with Adaptive Length Penalty (ALP)
AI & ML interests
Scaling up good synthetic reasoning. Post-training and synthetic data research lab.
This collection contains assets associated with the Big-Math dataset, a high-quality collection of over 250,000 math questions with verifiable answers.
- SynthLabsAI/Big-Math-RL-Verified
  Viewer • Updated • 251k • 5.57k • 185
- SynthLabsAI/Big-Math-RL-UNVERIFIED
  Viewer • Updated • 34.9k • 19 • 1
- nlile/NuminaMath-1.5-RL-Verifiable
  Viewer • Updated • 131k • 6.57k • 5
- Big-Math: A Large-Scale, High-Quality Math Dataset for Reinforcement Learning in Language Models
  Paper • 2502.17387 • Published • 6
Collection of various datasets related to the PERSONA paper.