Models

2,348

Active filters: compressed-tensors

aarnphm/llama-4-maverick-17b-128e-instruct-fp8-sharded-tp8

Image-Text-to-Text • Updated Jun 23 • 5

EclairJ/llama-3.1-8B-instruct-lora-W4A16-2048

2B • Updated Jun 23 • 4

rednote-hilab/dots.llm1.inst-FP8-dynamic

Text Generation • 1B • Updated Jun 24 • 26 • 5

alvion427/Mistral-Small-3.2-24B-Instruct-2506-fp8

24B • Updated Jun 23 • 5

EclairJ/llama-3.1-8B-instruct-lora-W8A8-2048

8B • Updated Jun 23 • 4

noneUsername/Homunculus-W8A8

12B • Updated Jun 23 • 5

noneUsername/Mistral-Small-3.2-24B-Instruct-hf-W8A8

24B • Updated Jun 23 • 18 • 1

SaitBurak/qwen3

8B • Updated Jun 24 • 6

Yi30/Llama-3.2-1B-Instruct-NVFP4-llmc

1B • Updated Jun 25 • 18

Yi30/DeepSeek-V2-Lite-NVFP4-llmc

9B • Updated 3 days ago • 12

Daiphuoc/gemma3-w8a8

5B • Updated Jun 25 • 4

nm-testing/Qwen2.5-VL-7B-Instruct-W4A16-G128

3B • Updated Jun 25 • 4

stelterlab/Mistral-Small-3.2-24B-Instruct-2506-FP8

Image-Text-to-Text • Updated Jun 26 • 122k • 4

parasail-ai/Llama-3.1-8B-Instruct-NVFP4A16

5B • Updated Jun 26 • 55

Yi30/Llama-3.3-70B-Instruct-NVFP4-llmc

73B • Updated Jun 28 • 5

jester6136/nanonets-ocr-w4a16-g128

2B • Updated Jun 26 • 4

jester6136/Nanonets-OCR-s-w8a8

4B • Updated Jul 2 • 33 • 1

Berkesule/qwenvl-2.5-7b-gptq-W4A16-quantize-tr-dpo-v2

3B • Updated Jun 26 • 6

wangruiai2023/GLM-4-9B-0414-AWQ-0626dev

Text Generation • 2B • Updated Jun 26 • 25

Berkesule/qwenvl-2.5-7b-gptq-W4816-quantize-tr-dpo-v2

3B • Updated Jun 27 • 30

noneUsername/Qwen3-32B-abliterated-llm-compressor-AWQ

6B • Updated Jun 26 • 27

Yi30/Llama-3.2-1B-Instruct-MXFP4-llmc

1B • Updated Jul 2 • 4

EclairJ/mistral-small-3.2-16bit-W4A16

4B • Updated Jun 27 • 21

Yi30/Llama-3.3-70B-Instruct-MXFP4-llmc

38B • Updated Jun 27 • 4

Yi30/Llama-3.2-1B-Instruct-MXFP8-llmc

2B • Updated Jun 28 • 5

Yi30/DeepSeek-V2-Lite-MXFP8-llmc

16B • Updated Jun 30 • 4

Yi30/Llama-3.3-70B-Instruct-MXFP8-llmc

73B • Updated Jun 28 • 4

RedHatAI/Llama-3.1-70B-Instruct-NVFP4

Text Generation • 41B • Updated Jun 30 • 113

RedHatAI/Llama-3.1-70B-Instruct-NVFP4A16

Text Generation • 41B • Updated Jun 30 • 47

RedHatAI/Qwen3-32B-NVFP4

Text Generation • 19B • Updated Jun 30 • 1.94k • 3