view article Article How We Use Claude Code Skills to Run 1,000+ ML Experiments a Day Dec 8, 2025 • 48
view article Article Transformers v5: Simple model definitions powering the AI ecosystem +2 Dec 1, 2025 • 268
Black-Box On-Policy Distillation of Large Language Models Paper • 2511.10643 • Published Nov 13, 2025 • 49
view article Article The 1 Billion Token Challenge: Finding the Perfect Pre-training Mix Nov 3, 2025 • 54
view article Article Tricks from OpenAI gpt-oss YOU 🫵 can use with transformers +5 Sep 11, 2025 • 177
view article Article Exploring Environments Hub: Your Language Model needs better (open) environments to learn Sep 4, 2025 • 29
Chain-of-Agents: End-to-End Agent Foundation Models via Multi-Agent Distillation and Agentic RL Paper • 2508.13167 • Published Aug 6, 2025 • 129
Tool Use Reasoning Collection A collection of tool use reasoning dataset in Hermes format • 5 items • Updated Jul 23, 2025 • 9
view article Article Accelerate ND-Parallel: A guide to Efficient Multi-GPU Training +3 Aug 8, 2025 • 90
view article Article Training Large Language Models with Interpreter Feedback using WebAssembly Apr 3, 2025 • 14
R1-Searcher: Incentivizing the Search Capability in LLMs via Reinforcement Learning Paper • 2503.05592 • Published Mar 7, 2025 • 27
Training Language Models to Self-Correct via Reinforcement Learning Paper • 2409.12917 • Published Sep 19, 2024 • 140
Qwen2-VL: Enhancing Vision-Language Model's Perception of the World at Any Resolution Paper • 2409.12191 • Published Sep 18, 2024 • 78
view article Article Efficient Deep Learning: A Comprehensive Overview of Optimization Techniques 👐 📚 Aug 26, 2024 • 82
The Mamba in the Llama: Distilling and Accelerating Hybrid Models Paper • 2408.15237 • Published Aug 27, 2024 • 42