UI-AGILE: Advancing GUI Agents with Effective Reinforcement Learning and Precise Inference-Time Grounding Paper • 2507.22025 • Published 16 days ago • 4
OS Agents: A Survey on MLLM-based Agents for General Computing Devices Use Paper • 2508.04482 • Published 8 days ago • 9
UserBench: An Interactive Gym Environment for User-Centric Agents Paper • 2507.22034 • Published 16 days ago • 26
3D-R1: Enhancing Reasoning in 3D VLMs for Unified Scene Understanding Paper • 2507.23478 • Published 14 days ago • 15
UniEgoMotion: A Unified Model for Egocentric Motion Reconstruction, Forecasting, and Generation Paper • 2508.01126 • Published 12 days ago • 4
LiveMCPBench: Can Agents Navigate an Ocean of MCP Tools? Paper • 2508.01780 • Published 11 days ago • 13
OpenMed NER: Open-Source, Domain-Adapted State-of-the-Art Transformers for Biomedical NER Across 12 Public Datasets Paper • 2508.01630 • Published 11 days ago • 7
Web-CogReasoner: Towards Knowledge-Induced Cognitive Reasoning for Web Agents Paper • 2508.01858 • Published 11 days ago • 20
Agent Lightning: Train ANY AI Agents with Reinforcement Learning Paper • 2508.03680 • Published 9 days ago • 53
SEAgent: Self-Evolving Computer Use Agent with Autonomous Learning from Experience Paper • 2508.04700 • Published 8 days ago • 46
Efficient Agents: Building Effective Agents While Reducing Cost Paper • 2508.02694 • Published 21 days ago • 77
DeepPHY: Benchmarking Agentic VLMs on Physical Reasoning Paper • 2508.05405 • Published 7 days ago • 61
InstructVLA: Vision-Language-Action Instruction Tuning from Understanding to Manipulation Paper • 2507.17520 • Published 22 days ago • 14
(Almost) Free Modality Stitching of Foundation Models Paper • 2507.10015 • Published about 1 month ago • 1
"PhyWorldBench": A Comprehensive Evaluation of Physical Realism in Text-to-Video Models Paper • 2507.13428 • Published 28 days ago • 15
Being-H0: Vision-Language-Action Pretraining from Large-Scale Human Videos Paper • 2507.15597 • Published 24 days ago • 33
Experience is the Best Teacher: Grounding VLMs for Robotics through Self-Generated Memory Paper • 2507.16713 • Published 23 days ago • 21
ThinkAct: Vision-Language-Action Reasoning via Reinforced Visual Latent Planning Paper • 2507.16815 • Published 23 days ago • 36