llama.cpp Support (See Loop-Instruct variant)

#9 by nologik


Note: If you're looking for llama.cpp/GGUF support, please check out the Loop-Instruct variant:

πŸ‘‰ IQuestLab/IQuest-Coder-V1-40B-Loop-Instruct

That model features an advanced loop-attention mechanism with dual attention and learned gating, and it is now fully supported in llama.cpp!

Pre-converted GGUF models available at: https://huggingface.co/Avarok/IQuest-Coder-V1-40B-Loop-Instruct-GGUF

Sizes: F16 (75GB), Q8_0 (40GB), Q5_K_M (27GB), Q4_K_M (23GB)

llama.cpp PR: https://github.com/ggml-org/llama.cpp/pull/18680
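For anyone who wants a quick start, here is a rough sketch of how you might fetch and run one of the quants above with a llama.cpp build that includes the linked PR. The exact GGUF filename inside the repo is an assumption; check the repo's file listing and adjust accordingly:

```shell
# Download just the Q4_K_M quant from the GGUF repo
# (the *Q4_K_M* filename pattern is assumed; verify it in the repo's file list)
huggingface-cli download Avarok/IQuest-Coder-V1-40B-Loop-Instruct-GGUF \
  --include "*Q4_K_M*" --local-dir ./models

# Run it with llama.cpp's CLI (requires a build containing the support PR);
# replace the .gguf path with the actual downloaded filename
llama-cli -m ./models/IQuest-Coder-V1-40B-Loop-Instruct-Q4_K_M.gguf \
  -p "Write a quicksort in Python" -n 256
```

Recent llama.cpp builds can also pull a quant straight from the Hub with the `-hf repo:quant` shorthand, which skips the separate download step.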
