Models: 22,831

Active filters: llama-cpp

CHE-72/Breeze-7B-Instruct-v1_0-Q2_K-GGUF

Text Generation • 7B • Updated Jun 22, 2024 • 1

CHE-72/Qwen1.5-4B-Chat-Q8_0-GGUF

Text Generation • 4B • Updated Jun 22, 2024 • 1

CHE-72/Qwen1.5-4B-Chat-Q6_K-GGUF

Text Generation • 4B • Updated Jun 22, 2024 • 2

CHE-72/Qwen1.5-4B-Chat-Q5_K_M-GGUF

Text Generation • 4B • Updated Jun 22, 2024 • 14

CHE-72/Qwen1.5-4B-Chat-Q5_K_S-GGUF

Text Generation • 4B • Updated Jun 22, 2024 • 4

CHE-72/Qwen1.5-4B-Chat-Q5_0-GGUF

Text Generation • 4B • Updated Jun 22, 2024 • 2

CHE-72/Qwen1.5-4B-Chat-Q4_K_M-GGUF

Text Generation • 4B • Updated Jun 22, 2024 • 11

CHE-72/Qwen1.5-4B-Chat-Q4_K_S-GGUF

Text Generation • 4B • Updated Jun 22, 2024 • 3

CHE-72/Qwen1.5-4B-Chat-Q4_0-GGUF

Text Generation • 4B • Updated Jun 22, 2024 • 30

CHE-72/Qwen1.5-4B-Chat-Q3_K_L-GGUF

Text Generation • 4B • Updated Jun 22, 2024 • 1

CHE-72/Qwen1.5-4B-Chat-Q3_K_M-GGUF

Text Generation • 4B • Updated Jun 22, 2024 • 1

CHE-72/Qwen1.5-4B-Chat-Q3_K_S-GGUF

Text Generation • 4B • Updated Jun 22, 2024 • 3

CHE-72/Qwen1.5-4B-Chat-Q2_K-GGUF

Text Generation • 4B • Updated Jun 22, 2024 • 33

newsletter/buddhi-128k-chat-7b-Q6_K-GGUF

Text Generation • 7B • Updated Jun 22, 2024 • 1

Nokilos/suzume-llama-3-8B-multilingual-Q4_K_M-GGUF

8B • Updated Jun 22, 2024 • 1

Nokilos/suzume-llama-3-8B-multilingual-Q5_K_S-GGUF

8B • Updated Jun 22, 2024 • 1

acen20/Meta-Llama-3-8B-Q2_K-GGUF

Text Generation • 8B • Updated Jun 22, 2024 • 7

internistai/base-7b-v0.2-Q4_K_M-GGUF

7B • Updated Jun 22, 2024

powermove72/SharkOgno-11b-Passthrough-Q4_K_M-GGUF

11B • Updated Jun 22, 2024

martintomov/Codestral-22B-v0.1-Q4_K_M-GGUF

22B • Updated Jun 22, 2024 • 2

martintomov/omost-llama-3-8b-Q8_0-GGUF

8B • Updated Jun 22, 2024

jeiku/Templar_v1_8B-Q3_K_S-GGUF

8B • Updated Jun 23, 2024 • 3

sugatoray/DeepSeek-Coder-V2-Lite-Instruct-Q4_K_M-GGUF

16B • Updated Jun 23, 2024 • 53

V15h/PLLaMa-7b-instruct-Q4_K_M-GGUF

7B • Updated Jun 23, 2024 • 1

KnutJaegersberg/Qwen2-Deita-500m-Q8_0-GGUF

0.6B • Updated Jun 23, 2024 • 2

sugatoray/DeepSeek-Coder-V2-Lite-Base-Q4_K_M-GGUF

16B • Updated Jun 23, 2024 • 21

stardustcx/airoboros-33b-gpt4-1.4.1-PI-8192-fp16-Q4_K_M-GGUF

33B • Updated Jun 23, 2024 • 37

notstevensalt/L3-8B-Stheno-v3.3-32K-Q5_K_M-GGUF

8B • Updated Jun 23, 2024 • 25

bunnycore/Llama3-OneForAll-8B-Q4_K_M-GGUF

8B • Updated Jun 23, 2024 • 1

markhneedham/Mistral-7B-v0.3-IQ4_NL-GGUF

7B • Updated Jun 23, 2024