Thinking to Recall: How Reasoning Unlocks Parametric Knowledge in LLMs Paper • 2603.09906 • Published 7 days ago • 65
Running on CPU Upgrade 189 The Synthetic Data Playbook: Generating Trillions of the Finest Tokens 📝 189 Visualize synthetic data experiments as an interactive bookshelf
On Data Engineering for Scaling LLM Terminal Capabilities Paper • 2602.21193 • Published 21 days ago • 96
Does Your Reasoning Model Implicitly Know When to Stop Thinking? Paper • 2602.08354 • Published Feb 9 • 261
Mobile-Agent-v3.5: Multi-platform Fundamental GUI Agents Paper • 2602.16855 • Published about 1 month ago • 50
SQuTR: A Robustness Benchmark for Spoken Query to Text Retrieval under Acoustic Noise Paper • 2602.12783 • Published Feb 13 • 148
Step 3.5 Flash: Open Frontier-Level Intelligence with 11B Active Parameters Paper • 2602.10604 • Published Feb 11 • 191
OPUS: Towards Efficient and Principled Data Selection in Large Language Model Pre-training in Every Iteration Paper • 2602.05400 • Published Feb 5 • 347
QuantaAlpha: An Evolutionary Framework for LLM-Driven Alpha Mining Paper • 2602.07085 • Published Feb 6 • 187
view article Article Community Evals: Because we're done trusting black-box leaderboards over the community +5 Feb 4 • 88
Idea2Story: An Automated Pipeline for Transforming Research Concepts into Complete Scientific Narratives Paper • 2601.20833 • Published Jan 28 • 182
DeepResearchEval: An Automated Framework for Deep Research Task Construction and Agentic Evaluation Paper • 2601.09688 • Published Jan 14 • 127