mobiuslabsgmbh/Meta-Llama-3-8B-Instruct_4bitgs64_hqq_hf Text Generation • 5B • Updated May 23 • 50 • 2
mobiuslabsgmbh/DeepSeek-R1-ReDistill-Llama3-8B-v1.1 Text Generation • 8B • Updated Jan 30 • 43 • 11
openai/whisper-large-v3-turbo Automatic Speech Recognition • 0.8B • Updated Oct 4, 2024 • 3.38M • 2.54k
Article Unlocking Longer Generation with Key-Value Cache Quantization By RaushanTurganbay • May 16, 2024 • 49
Post 2106 Releasing HQQ Llama-3.1-70b 4-bit quantized version! Check it out at mobiuslabsgmbh/Llama-3.1-70b-instruct_4bitgs64_hqq. Achieves 99% of the base model performance across various benchmarks! Details in the model card.