The Ultra-Scale Playbook 🌌 The ultimate guide to training LLM on large GPU Clusters • 3.05k
deepcogito/cogito-v2-preview-deepseek-671B-MoE Text Generation • 671B • Updated 15 days ago • 354 • 32
mlx-community/Qwen3-Coder-480B-A35B-Instruct-4bit Text Generation • 480B • Updated 23 days ago • 4.46k • 18