Article NVIDIA Releases 3 Million Sample Dataset for OCR, Visual Question Answering, and Captioning Tasks By nvidia and 4 others • 3 days ago • 39
Article Accelerate ND-Parallel: A Guide to Efficient Multi-GPU Training By siro1 and 4 others • 7 days ago • 44
Article Vision Language Model Alignment in TRL ⚡️ By sergiopaniego and 4 others • 8 days ago • 45
Article Introducing Command A Vision: Multimodal AI built for Business By CohereLabs and 3 others • 14 days ago • 61
EDGE-GRPO: Entropy-Driven GRPO with Guided Error Correction for Advantage Diversity Paper • 2507.21848 • Published 16 days ago • 7
GLiNER2: An Efficient Multi-Task Information Extraction System with Schema-Driven Interface Paper • 2507.18546 • Published 21 days ago • 18
ULD Loss (Universal LLMs Distillation) Collection The ULD loss, based on optimal transport, enables distillation across different LLM families without requiring shared tokenizers. • 14 items • Updated 30 days ago • 2
Enhancing the Reasoning Ability of Multimodal Large Language Models via Mixed Preference Optimization Paper • 2411.10442 • Published Nov 15, 2024 • 87
ThinkPRM Collection Process Reward Models that Think -- https://arxiv.org/abs/2504.16828 • 8 items • Updated 16 days ago • 3
Article Three Mighty Alerts Supporting Hugging Face’s Production Infrastructure By jcudit • Jul 8 • 10
Article Reachy Mini - The Open-Source Robot for Today's and Tomorrow's AI Builders By thomwolf and 1 other • Jul 9 • 643
Article SmolLM3: smol, multilingual, long-context reasoner By loubnabnl and 22 others • Jul 8 • 625
Article Bringing Fusion Down to Earth: ML for Stellarator Optimization By cgeorgiaw • Jul 2 • 72
Article Enhance Your Models in 5 Minutes with the Hugging Face Kernel Hub By drbh and 6 others • Jun 12 • 124
Article Gemma 3n fully available in the open-source ecosystem! By ariG23498 and 7 others • Jun 26 • 114
Article xLSTM-based time series model TiRex significantly outperforms competing models in forecasting accuracy By BobWue • Jun 4 • 12
Article No GPU left behind: Unlocking Efficiency with Co-located vLLM in TRL By toslali-ibm and 5 others • Jun 3 • 82