Nemotron-H: A Family of Accurate and Efficient Hybrid Mamba-Transformer Models Paper • 2504.03624 • Published Apr 4 • 13
Nemotron 4 340B Collection Nemotron-4: open models for Synthetic Data Generation (SDG). Includes Base, Instruct, and Reward models. • 4 items • Updated 21 days ago • 163
LLM Pruning and Distillation in Practice: The Minitron Approach Paper • 2408.11796 • Published Aug 21, 2024 • 59
Minitron Collection A family of compressed models obtained via pruning and knowledge distillation • 12 items • Updated 21 days ago • 61
Compact Language Models via Pruning and Knowledge Distillation Paper • 2407.14679 • Published Jul 19, 2024 • 40