Models

22,739

Active filters: llama-cpp

saejoon/gemma-2-27b-it-Q4_K_M-GGUF

Text Generation • 27B • Updated Jul 8, 2024 • 6 • 2

Rivaidan/Smegmma-9B-v1-Q8_0-GGUF

9B • Updated Jul 8, 2024 • 23

notjjustnumbers/madlad400-3b-mt-Q4_K_M-GGUF

Translation • 3B • Updated Jul 8, 2024 • 60 • 1

ethangoh/granite-8b-code-instruct-Q4_K_M-GGUF

Text Generation • 8B • Updated Jul 8, 2024 • 3

HenryyTwitchyfinger/L3-8B-Stheno-v3.2-Q4_K_M-GGUF

8B • Updated Jul 8, 2024 • 3

utterlygreat/omost-llama-3-8b-IQ4_NL-GGUF

8B • Updated Jul 8, 2024

teemperor/Phi-3-medium-128k-instruct-Q6_K-GGUF

Text Generation • 14B • Updated Jul 8, 2024 • 1

zhezhe/Gemma-2-9B-Chinese-Chat-Q4_K_M-GGUF

Text Generation • 9B • Updated Jul 8, 2024 • 1

NikolayKozloff/madlad400-3b-mt-Q8_0-GGUF

Translation • 3B • Updated Jul 8, 2024 • 25 • 1

NikolayKozloff/madlad400-10b-mt-Q8_0-GGUF

Translation • 11B • Updated Jul 8, 2024 • 24 • 3

NikolayKozloff/madlad400-10b-mt-Q6_K-GGUF

Translation • 11B • Updated Jul 8, 2024 • 4 • 1

Cran-May/internlm2_5-7b-chat-Q4_K_M-GGUF

Text Generation • 8B • Updated Jul 8, 2024 • 1

NikolayKozloff/sqt5-xl-Albanian-shqip-llama.cpp-compatible-Q8_0-GGUF

3B • Updated Jul 8, 2024 • 4 • 1

dimcha/mxbai-embed-large-v1-Q4_K_M-GGUF

Feature Extraction • 0.3B • Updated Jul 8, 2024 • 5

bmi-labmedinfo/Igea-3B-v0.1-GGUF

3B • Updated Jul 15, 2024

Cran-May/internlm2_5-7b-chat-IQ4_XS-GGUF

Text Generation • 8B • Updated Jul 8, 2024 • 2

carterprince/google-gemma-2-27b-it-ortho-Q4_K_S-GGUF

27B • Updated Jul 8, 2024 • 2 • 1

mrmage/Qwen2-0.5B-Instruct-Q4_K_M-GGUF

Text Generation • 0.5B • Updated Jul 8, 2024 • 11 • 1

mrmage/Qwen2-0.5B-Instruct-Q3_K_M-GGUF

Text Generation • 0.5B • Updated Jul 8, 2024 • 7

ZappY-AI/medllama3-v20-Q4_K_M-GGUF

8B • Updated Jul 8, 2024 • 1

martintomov/gemma-2-27b-it-Q8_0-GGUF

Text Generation • 27B • Updated Jul 8, 2024 • 4

jorismathijssen/t5-base-Q4_K_M-GGUF

Translation • 0.2B • Updated Jul 8, 2024 • 15 • 1

faceradix/Daredevil-8B-abliterated-Q4_K_M-GGUF

8B • Updated Jul 8, 2024 • 9

martintomov/gemma-2-9b-it-Q8_0-GGUF

Text Generation • 9B • Updated Jul 8, 2024 • 2

Nialixus/Meta-Llama-3-8B-Q4_K_M-GGUF

Text Generation • 8B • Updated Jul 8, 2024 • 4

saejoon/SOLAR-10.7B-Instruct-v1.0-Q4_K_M-GGUF

11B • Updated Jul 9, 2024 • 5

aifeifei798/Meta-Llama-3-8B-Instruct-Q5_K_M-GGUF

Text Generation • 8B • Updated Jul 9, 2024 • 4

NikolayKozloff/bella-2-8b-Q8_0-GGUF

Text Generation • 8B • Updated Jul 9, 2024 • 7 • 1

NikolayKozloff/Storm-7B-Q8_0-GGUF

7B • Updated Jul 9, 2024 • 1 • 2

NikolayKozloff/Einstein-v7-Qwen2-7B-Q8_0-GGUF

8B • Updated Jul 9, 2024 • 10 • 1