Crystalite: A Lightweight Transformer for Efficient Crystal Modeling Paper • 2604.02270 • Published 15 days ago • 1
Trace2Skill: Distill Trajectory-Local Lessons into Transferable Agent Skills Paper • 2603.25158 • Published 21 days ago • 50
Article SynthVision: Building a 110K Synthetic Medical VQA Dataset with Cross-Model Validation 24 days ago • 16
Self-Improving Pretraining: using post-trained models to pretrain better models Paper • 2601.21343 • Published Jan 29 • 18
Article Keep the Tokens Flowing: Lessons from 16 Open-Source RL Libraries +7 Mar 10 • 126
Teaching Models to Teach Themselves: Reasoning at the Edge of Learnability Paper • 2601.18778 • Published Jan 26 • 42
Aya Datasets Collection The Aya Collection is a massive multilingual collection for over 100 languages consisting of 513 million instances of prompts and completions. • 5 items • Updated Jul 31, 2025 • 28
Inference Optimized Checkpoints (with Model Optimizer) Collection A collection of generative models quantized and optimized for inference with Model Optimizer. • 61 items • Updated 1 day ago • 139
ECHO-2: A Large-Scale Distributed Rollout Framework for Cost-Efficient Reinforcement Learning Paper • 2602.02192 • Published Feb 2 • 12
Surprisal Guided Selection Collection Training at test-time for kernel optimization • 2 items • Updated Feb 12 • 1
OpenSec: Incident Response Agent Calibration Collection OpenSec is a dual-control RL environment, dataset, and evaluation suite that measures agent calibration on incident response tasks. • 4 items • Updated Feb 12 • 1
Surprisal-Guided Selection: Compute-Optimal Test-Time Strategies for Execution-Grounded Code Generation Paper • 2602.07670 • Published Feb 7 • 1
Article Where should test-time compute go? Surprisal-guided selection in verifiable environments Feb 7 • 1