ClawBench: Can AI Agents Complete Everyday Online Tasks? Paper • 2604.08523 • Published 2 days ago • 78 • 3
Rethinking Generalization in Reasoning SFT: A Conditional Analysis on Optimization, Data, and Model Capability Paper • 2604.06628 • Published 3 days ago • 152 • 3
HY-Embodied-0.5: Embodied Foundation Models for Real-World Agents Paper • 2604.07430 • Published 3 days ago • 126 • 3
MegaStyle: Constructing Diverse and Scalable Style Dataset via Consistent Text-to-Image Style Mapping Paper • 2604.08364 • Published 2 days ago • 79 • 7
When Numbers Speak: Aligning Textual Numerals and Visual Instances in Text-to-Video Diffusion Models Paper • 2604.08546 • Published 2 days ago • 104 • 2
SkillClaw: Let Skills Evolve Collectively with Agentic Evolver Paper • 2604.08377 • Published 2 days ago • 143 • 5
INSPATIO-WORLD: A Real-Time 4D World Simulator via Spatiotemporal Autoregressive Modeling Paper • 2604.07209 • Published 3 days ago • 25 • 2
SEVerA: Verified Synthesis of Self-Evolving Agents Paper • 2603.25111 • Published 16 days ago • 24 • 3
Combee: Scaling Prompt Learning for Self-Improving Language Model Agents Paper • 2604.04247 • Published 6 days ago • 24 • 3
MARS: Enabling Autoregressive Models Multi-Token Generation Paper • 2604.07023 • Published 3 days ago • 28 • 3
Think in Strokes, Not Pixels: Process-Driven Image Generation via Interleaved Reasoning Paper • 2604.04746 • Published 3 days ago • 57 • 3
Vanast: Virtual Try-On with Human Image Animation via Synthetic Triplet Supervision Paper • 2604.04934 • Published 5 days ago • 35 • 5
Beyond Accuracy: Unveiling Inefficiency Patterns in Tool-Integrated Reasoning Paper • 2604.05404 • Published 4 days ago • 38 • 4
ACES: Who Tests the Tests? Leave-One-Out AUC Consistency for Code Generation Paper • 2604.03922 • Published 6 days ago • 49 • 4
Claw-Eval: Toward Trustworthy Evaluation of Autonomous Agents Paper • 2604.06132 • Published 4 days ago • 107 • 5
Video-MME-v2: Towards the Next Stage in Benchmarks for Comprehensive Video Understanding Paper • 2604.05015 • Published 5 days ago • 222 • 8
SpatialEdit: Benchmarking Fine-Grained Image Spatial Editing Paper • 2604.04911 • Published 5 days ago • 32 • 3
AURA: Always-On Understanding and Real-Time Assistance via Video Streams Paper • 2604.04184 • Published 6 days ago • 43 • 3