These are quick GGUF quantizations of DiscoResearch/Llama3-DiscoLeo-Instruct-8B-v0.1.
They were done for testing purposes and include:
- one made with an older llama.cpp version (before the BPE pre-tokenizer fix), quantized from an fp16 intermediate
- one made with the same older llama.cpp version (before the BPE pre-tokenizer fix), quantized from an fp32 intermediate
- one made with a recent llama.cpp version that includes the BPE pre-tokenizer fix
Currently the GGUFs perform below expectations; the MLX conversion performs best in comparison. Any ideas why?
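For reference, the variants above follow the usual two-step llama.cpp workflow (HF checkpoint → fp16/fp32 GGUF intermediate → quantized GGUF). A minimal sketch; the script and binary names are assumptions that vary across llama.cpp versions, and the file names are illustrative:

```shell
# Older llama.cpp releases shipped convert.py and ./quantize; recent ones
# (with the BPE pre-tokenizer fix) ship convert_hf_to_gguf.py and
# ./llama-quantize. Adjust names to your checkout.

# 1. Convert the HF checkpoint to a GGUF intermediate (use --outtype f32
#    for the fp32 path):
python convert_hf_to_gguf.py path/to/Llama3-DiscoLeo-Instruct-8B-v0.1 \
    --outtype f16 --outfile discoleo-8b-f16.gguf

# 2. Quantize the intermediate to the target format, e.g. Q4_K_M:
./llama-quantize discoleo-8b-f16.gguf discoleo-8b-Q4_K_M.gguf Q4_K_M
```

Note that only the recent convert script writes the pre-tokenizer metadata the BPE fix relies on, so intermediates produced by the older script stay unfixed even when quantized with a newer binary.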