Models: 344

Active filters: quantization

qihoo360/Light-R1-14B-DS-GGUF

Text Generation • 15B • Updated Mar 13 • 56 • 9

btaskel/wai-shuffle-noob-v20-GGUF

Text-to-Image • 3B • Updated Mar 26 • 11

exxocism/featherless-ai_Qwerky-72B-GGUF

Text Generation • 79B • Updated Apr 1

skatardude10/SnowDrogito-RpR-32B_IQ4-XS

33B • Updated May 9 • 103 • 1

DhulipallaGopiChandu/wav2vec2-lora-quantized

Updated Apr 19 • 1

itlwas/Hiber-Multi-10B-Instruct-Q4_K_M-GGUF

Text Generation • 11B • Updated Apr 19 • 10 • 1

Uninformed/QwQ-32B-abliterated-exl2-5bpw-h8

Text Generation • Updated Apr 20 • 3

NoorNizar/Llama-3.2-3B-Instruct-WINT8

Text Generation • 4B • Updated Apr 23 • 5

NoorNizar/Meta-Llama-3-8B-Instruct-WFP8

Text Generation • 8B • Updated Apr 21 • 3

NoorNizar/Meta-Llama-3-8B-Instruct-WINT8

Text Generation • 8B • Updated Apr 21 • 3

btaskel/Illustrious-XL-v2.0-GGUF

Text-to-Image • 3B • Updated Apr 21 • 55 • 4

agoor97/onnx-models

TechyCode/tinyllama-sciq-lora

Text Generation • Updated Apr 23

Sumo10/Phi-4-mini-instruct-AWQ-4bit

1B • Updated Apr 25 • 20 • 1

Sumo10/Llama-3.2-3B-Instruct-AWQ-4bit

0.8B • Updated Apr 25 • 3

NoorNizar/Phi-4-mini-instruct-WINT4

Text Generation • 4B • Updated May 3 • 4

NoorNizar/Meta-Llama-3-8B-Instruct-WINT4

Text Generation • 8B • Updated May 3 • 3

NoorNizar/Llama-3.2-3B-Instruct-WINT4

Text Generation • 4B • Updated May 3 • 4

mengqin1/RedidreamNSFWI1-bnb-4bit

stabilityai/stable-diffusion-3.5-large-tensorrt

Text-to-Image • Updated May 16 • 17

abdou-u/MNLP_M2_quantized_model

Text Generation • 0.4B • Updated May 19 • 13

diffusers/FLUX.1-dev-bnb-8bit

Text-to-Image • Updated May 20 • 173 • 1

diffusers/FLUX.1-dev-torchao-int8

Text-to-Image • Updated May 20 • 469 • 1

diffusers/FLUX.1-dev-torchao-int4

Text-to-Image • Updated May 20 • 33 • 1

diffusers/FLUX.1-dev-torchao-fp8

Text-to-Image • Updated May 21 • 88 • 1

zay25/MNLP_M2_quantized_model

Text Generation • 0.8B • Updated May 27 • 4

textgeflecht/Devstral-Small-2505-FP8-llmcompressor

Text Generation • 24B • Updated May 25 • 105

fukayatti0/nllb-200-distilled-600M-4bit-efqat

Translation • Updated May 28 • 9

HighCWu/FLUX.1-dev-bnb-hqq-4bit

Text-to-Image • Updated May 29 • 16

ConfidentialMind/InternVL3-38B-FP8-Dynamic

Image-Text-to-Text • 38B • Updated 30 days ago • 27 • 1