GLM-4.1V-Thinking: Towards Versatile Multimodal Reasoning with Scalable Reinforcement Learning Paper • 2507.01006 • Published Jul 1 • 212
MiroMind-M1: An Open-Source Advancement in Mathematical Reasoning via Context-Aware Multi-Stage Policy Optimization Paper • 2507.14683 • Published 23 days ago • 124
deepcogito/cogito-v2-preview-llama-109B-MoE Image-Text-to-Text • 109B • Updated 12 days ago • 1.28k • 31
Small Batch Size Training for Language Models: When Vanilla SGD Works, and Why Gradient Accumulation Is Wasteful Paper • 2507.07101 • Published Jul 9 • 3