Fast and Accurate Causal Parallel Decoding using Jacobi Forcing Paper • 2512.14681 • Published 16 days ago • 39
Efficient Long-context Language Model Training by Core Attention Disaggregation Paper • 2510.18121 • Published Oct 20, 2025 • 122
Chain-of-Agents: End-to-End Agent Foundation Models via Multi-Agent Distillation and Agentic RL Paper • 2508.13167 • Published Aug 6, 2025 • 129
On the Generalization of SFT: A Reinforcement Learning Perspective with Reward Rectification Paper • 2508.05629 • Published Aug 7, 2025 • 180
Scaling Speculative Decoding with Lookahead Reasoning Paper • 2506.19830 • Published Jun 24, 2025 • 12
Beyond the 80/20 Rule: High-Entropy Minority Tokens Drive Effective Reinforcement Learning for LLM Reasoning Paper • 2506.01939 • Published Jun 2, 2025 • 187
Faster Video Diffusion with Trainable Sparse Attention Paper • 2505.13389 • Published May 19, 2025 • 37
ReTool: Reinforcement Learning for Strategic Tool Use in LLMs Paper • 2504.11536 • Published Apr 15, 2025 • 63
Qwen2.5-1M Collection The long-context version of Qwen2.5, supporting 1M-token context lengths • 3 items • Updated 1 day ago • 126
Efficiently Serving LLM Reasoning Programs with Certaindex Paper • 2412.20993 • Published Dec 30, 2024 • 36
Transformers compatible Mamba Collection This release includes the `mamba` repositories compatible with the `transformers` library • 5 items • Updated Mar 6, 2024 • 39