Ollama

To get the Q4_1 version of the model, simply run:

ollama pull wavecut/vikhr

Alternatively, create the model from a different quantization (bits per weight, bpw) using an Ollama Modelfile:

FROM ./vikhr-7b-instruct_0.4.INSERT_YOUR_QUANT_HERE.gguf
PARAMETER temperature 0.25
PARAMETER top_k 50
PARAMETER top_p 0.98
PARAMETER num_ctx 1512
PARAMETER stop <|im_end|>
PARAMETER stop <|im_start|>
SYSTEM """"""
TEMPLATE """<s>{{ if .System }}<|im_start|>system
{{ .System }}<|im_end|>
{{ end }}{{ if .Prompt }}<|im_start|>user
{{ .Prompt }}<|im_end|>
{{ end }}<|im_start|>assistant
"""
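
To see what prompt the TEMPLATE above produces, here is a minimal Python sketch that mimics it for a single turn. Ollama itself renders the template with Go templates; `render_prompt` is a hypothetical helper written only for illustration and is not part of Ollama.

```python
def render_prompt(prompt: str, system: str = "") -> str:
    """Mimic the Modelfile TEMPLATE for one turn (illustrative helper, not an Ollama API)."""
    parts = ["<s>"]
    if system:
        # Corresponds to the {{ if .System }} ... {{ end }} branch.
        parts.append(f"<|im_start|>system\n{system}<|im_end|>\n")
    if prompt:
        # Corresponds to the {{ if .Prompt }} ... {{ end }} branch.
        parts.append(f"<|im_start|>user\n{prompt}<|im_end|>\n")
    # The template always ends by opening the assistant turn.
    parts.append("<|im_start|>assistant\n")
    return "".join(parts)


if __name__ == "__main__":
    print(render_prompt("Hi", system="You are a helpful assistant."))
```

With the empty SYSTEM from the Modelfile, the system branch is skipped and the prompt starts directly with the user turn. Generation then stops at `<|im_end|>` or `<|im_start|>`, matching the two `stop` parameters.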

Save the lines above to a file named Modelfile, then build and run the model:

ollama create vikhr -f Modelfile
ollama run vikhr

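
Once the model has been created, it can also be queried programmatically. The sketch below assumes an Ollama server is running locally on its default address (`http://localhost:11434`) and that the model was created under the name `vikhr` as above. It uses Ollama's `/api/generate` endpoint with streaming disabled; the helper names `build_payload` and `generate` are made up for this example.

```python
import json
import urllib.request

# Assumption: Ollama is listening on its default local address.
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_payload(prompt: str, model: str = "vikhr") -> dict:
    """Build the JSON body for a single non-streaming generation request."""
    return {"model": model, "prompt": prompt, "stream": False}


def generate(prompt: str, model: str = "vikhr", url: str = OLLAMA_URL) -> str:
    """Send the prompt to the local Ollama server and return the generated text."""
    data = json.dumps(build_payload(prompt, model)).encode("utf-8")
    request = urllib.request.Request(
        url, data=data, headers={"Content-Type": "application/json"}
    )
    # Supplying a request body makes urllib issue a POST.
    with urllib.request.urlopen(request) as response:
        body = json.loads(response.read().decode("utf-8"))
    return body["response"]


# Example (requires a running Ollama server with the model available):
#     print(generate("Hi"))
```

The sampling parameters and the prompt template come from the Modelfile, so the request only needs the model name and the prompt.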
Format: GGUF
Model size: 7.63B params
Architecture: llama