The Path Not Taken: RLVR Provably Learns Off the Principals • Paper • 2511.08567 • Published Nov 11 • 33
Efficient-DLM: From Autoregressive to Diffusion Language Models, and Beyond in Speed • Paper • 2512.14067 • Published 10 days ago • 12
VOYAGER: A Training Free Approach for Generating Diverse Datasets using LLMs • Paper • 2512.12072 • Published 14 days ago • 17
Improving Recursive Transformers with Mixture of LoRAs • Paper • 2512.12880 • Published 12 days ago • 4
Nemotron-Math: Efficient Long-Context Distillation of Mathematical Reasoning from Multi-Mode Supervision • Paper • 2512.15489 • Published 9 days ago • 6
QwenLong-L1.5: Post-Training Recipe for Long-Context Reasoning and Memory Management • Paper • 2512.12967 • Published 11 days ago • 100
Nemotron-Cascade: Scaling Cascaded Reinforcement Learning for General-Purpose Reasoning Models • Paper • 2512.13607 • Published 11 days ago • 26
Exploring MLLM-Diffusion Information Transfer with MetaCanvas • Paper • 2512.11464 • Published 14 days ago • 12
SVG-T2I: Scaling Up Text-to-Image Latent Diffusion Model Without Variational Autoencoder • Paper • 2512.11749 • Published 14 days ago • 36
Long-horizon Reasoning Agent for Olympiad-Level Mathematical Problem Solving • Paper • 2512.10739 • Published 15 days ago • 45
OneThinker: All-in-one Reasoning Model for Image and Video • Paper • 2512.03043 • Published 24 days ago • 32
FMA-Net++: Motion- and Exposure-Aware Real-World Joint Video Super-Resolution and Deblurring • Paper • 2512.04390 • Published 23 days ago • 8
Deep Forcing: Training-Free Long Video Generation with Deep Sink and Participative Compression • Paper • 2512.05081 • Published 22 days ago • 30
JEPA as a Neural Tokenizer: Learning Robust Speech Representations with Density Adaptive Attention • Paper • 2512.07168 • Published 18 days ago • 1
One Layer Is Enough: Adapting Pretrained Visual Encoders for Image Generation • Paper • 2512.07829 • Published 18 days ago • 21