runtime error
ggml-model-q4_1.bin: 100%|██████████| 8.81G/8.81G [00:23<00:00, 374MB/s]
tokenizer.model: 100%|██████████| 1.14M/1.14M [00:00<00:00, 1.81MB/s]
gguf_init_from_file: invalid magic characters tjgg.
error loading model: llama_model_loader: failed to load model from ggml-model-q4_1.bin
llama_load_model_from_file: failed to load model
AVX = 1 | AVX2 = 1 | AVX512 = 1 | AVX512_VBMI = 1 | AVX512_VNNI = 1 | FMA = 1 | NEON = 0 | ARM_FMA = 0 | F16C = 1 | FP16_VA = 0 | WASM_SIMD = 0 | BLAS = 0 | SSE3 = 1 | SSSE3 = 1 | VSX = 0 |
Traceback (most recent call last):
  File "/home/user/app/app.py", line 54, in <module>
    model = Llama(
  File "/home/user/.local/lib/python3.10/site-packages/llama_cpp/llama.py", line 923, in __init__
    self._n_vocab = self.n_vocab()
  File "/home/user/.local/lib/python3.10/site-packages/llama_cpp/llama.py", line 2119, in n_vocab
    return self._model.n_vocab()
  File "/home/user/.local/lib/python3.10/site-packages/llama_cpp/llama.py", line 250, in n_vocab
    assert self.model is not None
AssertionError
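What the log shows: `gguf_init_from_file` reads the first four bytes of `ggml-model-q4_1.bin`, finds `tjgg` (the magic of the old `ggjt` ggml container) instead of `GGUF`, and aborts the load; `self.model` stays `None`, which then trips the assertion in `llama_cpp`. Current llama.cpp / llama-cpp-python only load GGUF files, so the fix is to download a `.gguf` quantization of the model (or convert the old file, e.g. with llama.cpp's `convert-llama-ggml-to-gguf.py`). A minimal pre-flight check could catch this before constructing `Llama`; the `detect_model_format` helper and the set of legacy magics below are an illustrative sketch, not part of the llama-cpp-python API:

```python
# Sketch: inspect a model file's 4-byte magic before passing it to
# llama_cpp.Llama, so a legacy ggml file fails with a clear message
# instead of an AssertionError deep inside the library.

GGUF_MAGIC = b"GGUF"  # container format current llama.cpp expects
# Old ggml-family headers as they appear on disk (little-endian bytes);
# b"tjgg" is the "ggjt" magic reported in the log above.
LEGACY_MAGICS = {b"lmgg", b"tjgg", b"fjgg"}

def detect_model_format(path: str) -> str:
    """Return 'gguf', 'legacy-ggml', or 'unknown' for the file at `path`."""
    with open(path, "rb") as f:
        magic = f.read(4)
    if magic == GGUF_MAGIC:
        return "gguf"
    if magic in LEGACY_MAGICS:
        return "legacy-ggml"
    return "unknown"

if __name__ == "__main__":
    fmt = detect_model_format("ggml-model-q4_1.bin")
    if fmt != "gguf":
        raise SystemExit(
            f"model is {fmt!r}, not GGUF -- download a .gguf file "
            "or convert the old ggml file first"
        )
```

Running this against the file from the log would report `legacy-ggml` rather than letting the load fail with `invalid magic characters tjgg`.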