Models: 27,148

Active filters: 8-bit

openai/gpt-oss-120b

Text Generation • 120B • Updated 1 day ago • 325k • 3.06k

openai/gpt-oss-20b

Text Generation • 22B • Updated 1 day ago • 1.26M • 2.62k

unsloth/gpt-oss-20b

Text Generation • 22B • Updated about 2 hours ago • 3.63k • 23

lmstudio-community/gpt-oss-20b-MLX-8bit

Text Generation • 21B • Updated 4 days ago • 566k • 20

lmstudio-community/gpt-oss-120b-MLX-8bit

Text Generation • 117B • Updated 4 days ago • 183k • 7

unsloth/gpt-oss-120b

Text Generation • 120B • Updated about 2 hours ago • 488 • 6

lmstudio-community/Qwen3-4B-Thinking-2507-MLX-8bit

Text Generation • 1B • Updated 3 days ago • 1.74k • 6

mlx-community/Qwen3-4B-Instruct-2507-8bit

Text Generation • 1B • Updated 3 days ago • 380 • 4

MaziyarPanahi/Phi-4-mini-instruct-GGUF

Text Generation • 4B • Updated Mar 1 • 177k • 9

mlx-community/XBai-o4-8bit

Text Generation • 33B • Updated 8 days ago • 529 • 5

MaziyarPanahi/Phi-3.5-mini-instruct-GGUF

Text Generation • 4B • Updated Aug 20, 2024 • 221k • 18

microsoft/bitnet-b1.58-2B-4T

Text Generation • 0.8B • Updated May 1 • 4.71k • 1.15k

huizimao/gpt-oss-20b-uncensored-mxfp4

21B • Updated about 20 hours ago • 2

MaziyarPanahi/Meta-Llama-3.1-8B-Instruct-GGUF

Text Generation • 8B • Updated Jul 23, 2024 • 176k • 24

MaziyarPanahi/solar-pro-preview-instruct-GGUF

Text Generation • 22B • Updated Sep 13, 2024 • 172k • 26

MaziyarPanahi/Qwen2.5-1.5B-Instruct-GGUF

Text Generation • 2B • Updated Sep 18, 2024 • 172k • 7

MaziyarPanahi/Llama-3.2-1B-Instruct-GGUF

Text Generation • 1B • Updated Sep 25, 2024 • 180k • 16

mlx-community/airoboros-33b-gpt4-1.4

9B • Updated Oct 19, 2024 • 334 • 2

MaziyarPanahi/Llama-3.3-70B-Instruct-GGUF

Text Generation • 71B • Updated Dec 7, 2024 • 226k • 16

NeoChen1024/Ministral-8B-Instruct-2410-W8A8

8B • Updated Jan 17 • 11 • 2

RedHatAI/Llama-3.3-70B-Instruct-quantized.w8a8

Text Generation • 71B • Updated May 30 • 46.7k • 11

MaziyarPanahi/gemma-3-1b-it-GGUF

Text Generation • 1.0B • Updated Mar 12 • 184k • 8

nvidia/Llama-4-Scout-17B-16E-Instruct-FP4

62B • Updated Apr 14 • 2k • 2

MaziyarPanahi/Qwen3-0.6B-GGUF

Text Generation • 0.8B • Updated Apr 28 • 173k • 5

MaziyarPanahi/Qwen3-14B-GGUF

Text Generation • 15B • Updated Apr 28 • 173k • 3

Qwen/Qwen3-1.7B-GPTQ-Int8

Text Generation • 0.7B • Updated May 21 • 1.95k • 3

Qwen/Qwen3-0.6B-GPTQ-Int8

Text Generation • 0.3B • Updated May 21 • 3.11k • 8

iqbalamo93/Phi-4-mini-instruct-GPTQ-8bit

Text Generation • 1B • Updated May 20 • 115k • 1

amd/DeepSeek-R1-MXFP4-Preview

357B • Updated 4 days ago • 1.56k • 1

Qwen/Qwen3-0.6B-MLX-8bit

Text Generation • 0.2B • Updated Jul 7 • 243 • 2