llama.cpp Support (See Loop-Instruct variant)
#9 · opened by nologik
Note: If you're looking for llama.cpp/GGUF support, please check out the Loop-Instruct variant:
IQuestLab/IQuest-Coder-V1-40B-Loop-Instruct
This model features an advanced loop-attention mechanism with dual attention and learned gating, and it is now fully supported in llama.cpp!
Pre-converted GGUF models available at: https://huggingface.co/Avarok/IQuest-Coder-V1-40B-Loop-Instruct-GGUF
Available quantizations: F16 (75 GB), Q8_0 (40 GB), Q5_K_M (27 GB), Q4_K_M (23 GB)
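For reference, here is a minimal sketch of fetching one of the quants and running it with a llama.cpp build that includes the linked PR. The exact GGUF filename is an assumption; check the repo's file listing for the real shard names.

```shell
# Repo name from the post; the .gguf filename below is assumed, not verified.
REPO=Avarok/IQuest-Coder-V1-40B-Loop-Instruct-GGUF
FILE=IQuest-Coder-V1-40B-Loop-Instruct-Q4_K_M.gguf

# Download a single quant (requires the huggingface_hub CLI: pip install huggingface_hub)
huggingface-cli download "$REPO" "$FILE" --local-dir .

# Run interactively; -ngl 99 offloads all layers to the GPU if one is available.
llama-cli -m "$FILE" -p "Write a quicksort in Python." -ngl 99
```

At ~23 GB, the Q4_K_M quant is the most practical choice for a single 24 GB-class GPU; the larger quants will need CPU offload or multiple GPUs.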
llama.cpp PR: https://github.com/ggml-org/llama.cpp/pull/18680