Models: 41,488

Active filters: 4-bit

nota-ai/Solar-Open-100B-NotaMoEQuant-Int4

Text Generation • 2B • Updated 2 days ago • 87 • 29

mlx-community/GLM-4.7-Flash-4bit

Text Generation • 30B • Updated 2 days ago • 6.96k • 43

huihui-ai/Huihui-GLM-4.7-Flash-abliterated-mlx-4bit

Text Generation • 30B • Updated 5 days ago • 752 • 6

mlx-community/gpt-oss-20b-MXFP4-Q8

Text Generation • Updated Aug 29, 2025 • 765k • 27

Intel/GLM-ASR-Nano-2512-int4-AutoRound

0.5B • Updated 7 days ago • 74 • 5

0xSero/GLM-4.7-REAP-50-W4A16

Text Generation • 2B • Updated 23 days ago • 6.49k • 64

Disty0/FLUX.2-klein-9B-SDNQ-4bit-dynamic-svd-r32

Text-to-Image • Updated 10 days ago • 2.59k • 8

QuantTrio/Step3-VL-10B-AWQ

Image-Text-to-Text • 10B • Updated 5 days ago • 3.65k • 4

mlx-community/VibeVoice-ASR-4bit

Automatic Speech Recognition • 8B • Updated 5 days ago • 139 • 4

inferencerlabs/Kimi-K2.5-MLX-3.6bit

Text Generation • Updated 33 minutes ago • 4

Qwen/Qwen3-14B-AWQ

Text Generation • 15B • Updated May 21, 2025 • 1.12M • 51

nota-ai/Qwen3-30B-A3B-NotaMoEQuant-Int4

Text Generation • 0.6B • Updated 12 days ago • 32 • 7

lmstudio-community/GLM-4.7-Flash-MLX-4bit

Text Generation • 30B • Updated 5 days ago • 11.7k • 6

Intel/GLM-4.7-Flash-int4-AutoRound

1B • Updated 6 days ago • 446 • 3

mlx-community/Qwen3-TTS-12Hz-0.6B-Base-4bit

Text-to-Speech • 0.4B • Updated 2 days ago • 852 • 3

Intel/DeepSeek-V3.2-int4-AutoRound

Text Generation • 3B • Updated 5 days ago • 45 • 3

TheBloke/MythoMax-L2-13B-GPTQ

Text Generation • 13B • Updated Sep 27, 2023 • 605 • 218

MaziyarPanahi/Qwen2.5-1.5B-Instruct-GGUF

Text Generation • 2B • Updated Sep 18, 2024 • 153k • 10

Qwen/Qwen2.5-Coder-32B-Instruct-AWQ

Text Generation • 33B • Updated Nov 18, 2024 • 354k • 32

Qwen/Qwen2.5-VL-7B-Instruct-AWQ

Image-Text-to-Text • 8B • Updated Apr 6, 2025 • 606k • 98

unsloth/gemma-3-12b-it-bnb-4bit

Image-to-Text • 13B • Updated May 12, 2025 • 29.7k • 31

MaziyarPanahi/Qwen3-1.7B-GGUF

Text Generation • 2B • Updated Apr 28, 2025 • 233k • 6

Qwen/Qwen3-32B-AWQ

Text Generation • 33B • Updated May 21, 2025 • 207k • 121

unsloth/medgemma-27b-text-it-bnb-4bit

Image-Text-to-Text • 28B • Updated May 20, 2025 • 332 • 2

169Pi/Alpie-Core

Text Generation • Updated 21 days ago • 56 • 7

mlx-community/gpt-oss-20b-MXFP4-Q4

Text Generation • 21B • Updated Aug 29, 2025 • 7.08k • 14

MaziyarPanahi/NVIDIA-Nemotron-Nano-12B-v2-GGUF

Text Generation • 12B • Updated Nov 28, 2025 • 74.4k • 2

nota-ai/GLM-4.5-Air-NotaMoeQuant-Int4

Text Generation • 1B • Updated 30 days ago • 26 • 5

marksverdhai/vibevoice-7b-bnb-4bit

Text-to-Speech • 10B • Updated 28 days ago • 497 • 3

Disty0/LTX-2-SDNQ-4bit-dynamic

Updated 19 days ago • 487 • 9