Magpie-Align/Magpie-Reasoning-V2-250K-CoT-Deepseek-R1-Llama-70B Viewer • Updated Jan 27 • 250k • 456 • 96
Absolute Zero: Reinforced Self-play Reasoning with Zero Data Paper • 2505.03335 • Published 7 days ago • 124
InternVL3: Exploring Advanced Training and Test-Time Recipes for Open-Source Multimodal Models Paper • 2504.10479 • Published 29 days ago • 255
Post 2908
this paper has been blowing up
they train an open-source multimodal LLM (InternVL3) that can compete with GPT-4o and Claude 3.5 Sonnet by:
> training text and vision on a single stage
> a novel V2PE positional encoding
> SFT & mixed preference optimization
> test-time scaling
Paper: InternVL3: Exploring Advanced Training and Test-Time Recipes for Open-Source Multimodal Models (2504.10479)
❤️ 6 👍 2 🔥 2 👀 1
Qwen2.5-VL Collection Vision-language model series based on Qwen2.5 • 11 items • Updated 15 days ago • 464
Rethinking RL Scaling for Vision Language Models: A Transparent, From-Scratch Framework and Comprehensive Evaluation Scheme Paper • 2504.02587 • Published Apr 3 • 30
Post 1836
MAYE🎈 a from-scratch RL framework for Vision Language Models, released by GAIR - an active research group from the Chinese community.
✨ Minimal & transparent pipeline with standard tools
✨ Standardized eval to track training & reflection
✨ Open Code & Dataset
Code: https://github.com/GAIR-NLP/MAYE?tab=readme-ov-file
Dataset: ManTle/MAYE
Paper: Rethinking RL Scaling for Vision Language Models: A Transparent, From-Scratch Framework and Comprehensive Evaluation Scheme (2504.02587)
1 reply · 👍 4
Post 4746
Qwen 3 could launch very soon. 👀
https://github.com/ggml-org/llama.cpp/pull/12828
3 replies · 🔥 16 👀 9 ❤️ 8