Quartet II: Accurate LLM Pre-Training in NVFP4 by Improved Unbiased Gradient Estimation Paper • 2601.22813 • Published 4 days ago • 47
Less is More: Recursive Reasoning with Tiny Networks Paper • 2510.04871 • Published Oct 6, 2025 • 506
Llama-GENBA-10B: A Trilingual Large Language Model for German, English and Bavarian Paper • 2509.05668 • Published Sep 6, 2025 • 6