From Word to World: Can Large Language Models be Implicit Text-based World Models? Paper • 2512.18832 • Published 5 days ago • 9
SWE-EVO: Benchmarking Coding Agents in Long-Horizon Software Evolution Scenarios Paper • 2512.18470 • Published 6 days ago • 8
Learning from Next-Frame Prediction: Autoregressive Video Modeling Encodes Effective Representations Paper • 2512.21004 • Published 2 days ago • 10
TokSuite: Measuring the Impact of Tokenizer Choice on Language Model Behavior Paper • 2512.20757 • Published 3 days ago • 14
DreaMontage: Arbitrary Frame-Guided One-Shot Video Generation Paper • 2512.21252 • Published 2 days ago • 30
Nemotron 3 Nano: Open, Efficient Mixture-of-Experts Hybrid Mamba-Transformer Model for Agentic Reasoning Paper • 2512.20848 • Published 3 days ago • 23
HiStream: Efficient High-Resolution Video Generation via Redundancy-Eliminated Streaming Paper • 2512.21338 • Published 2 days ago • 15
T2AV-Compass: Towards Unified Evaluation for Text-to-Audio-Video Generation Paper • 2512.21094 • Published 2 days ago • 24
TurboDiffusion: Accelerating Video Diffusion Models by 100-200 Times Paper • 2512.16093 • Published 9 days ago • 63
Learning to Reason in 4D: Dynamic Spatial Understanding for Vision Language Models Paper • 2512.20557 • Published 3 days ago • 45
Reinforcement Learning for Self-Improving Agent with Skill Library Paper • 2512.17102 • Published 8 days ago • 20
Bottom-up Policy Optimization: Your Language Model Policy Secretly Contains Internal Policies Paper • 2512.19673 • Published 4 days ago • 59