REFINE-AF: A Task-Agnostic Framework to Align Language Models via Self-Generated Instructions using Reinforcement Learning from Automated Feedback Paper • 2505.06548 • Published 4 days ago • 26
MiMo: Unlocking the Reasoning Potential of Language Model -- From Pretraining to Posttraining Paper • 2505.07608 • Published 1 day ago • 53
Healthy LLMs? Benchmarking LLM Knowledge of UK Government Public Health Information Paper • 2505.06046 • Published 4 days ago • 11
Sailing AI by the Stars: A Survey of Learning from Rewards in Post-Training and Test-Time Scaling of Large Language Models Paper • 2505.02686 • Published 8 days ago • 12
UniVLA: Learning to Act Anywhere with Task-centric Latent Actions Paper • 2505.06111 • Published 4 days ago • 19
Vision-Language-Action Models: Concepts, Progress, Applications and Challenges Paper • 2505.04769 • Published 6 days ago • 7
Flow-GRPO: Training Flow Matching Models via Online RL Paper • 2505.05470 • Published 5 days ago • 64
Perception, Reason, Think, and Plan: A Survey on Large Multimodal Reasoning Models Paper • 2505.04921 • Published 6 days ago • 127
X-Reasoner: Towards Generalizable Reasoning Across Modalities and Domains Paper • 2505.03981 • Published 7 days ago • 14
Sentient Agent as a Judge: Evaluating Higher-Order Social Cognition in Large Language Models Paper • 2505.02847 • Published 12 days ago • 24
On Path to Multimodal Generalist: General-Level and General-Bench Paper • 2505.04620 • Published 6 days ago • 71
OpenVision: A Fully-Open, Cost-Effective Family of Advanced Vision Encoders for Multimodal Learning Paper • 2505.04601 • Published 6 days ago • 18
R&B: Domain Regrouping and Data Mixture Balancing for Efficient Foundation Model Training Paper • 2505.00358 • Published 13 days ago • 20
ZeroSearch: Incentivize the Search Capability of LLMs without Searching Paper • 2505.04588 • Published 6 days ago • 55