# MPT

Ref: https://github.com/mosaicml/llm-foundry#mpt

## Usage

```bash
# get the repo and build it
git clone https://github.com/ggerganov/ggml
cd ggml
mkdir build && cd build
cmake ..
make -j

# get the model from HuggingFace
# be sure to have git-lfs installed
git clone https://huggingface.co/mosaicml/mpt-30b

# convert model to FP16
python3 ../examples/mpt/convert-h5-to-ggml.py ./mpt-30b 1

# run inference using FP16 precision
./bin/mpt -m ./mpt-30b/ggml-model-f16.bin -p "I believe the meaning of life is" -t 8 -n 64

# quantize the model to 5-bits using Q5_0 quantization
./bin/mpt-quantize ./mpt-30b/ggml-model-f16.bin ./mpt-30b/ggml-model-q5_0.bin q5_0
```
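
The steps above rely on several command-line tools (`git`, `git-lfs`, `cmake`, `make`, `python3`). A minimal sketch of a pre-flight check is shown below. Note that `missing_tools` is a hypothetical helper written for this example, not something shipped with ggml, and it assumes `git-lfs` is installed as a standalone `git-lfs` executable on `PATH`.

```bash
# Hypothetical helper (not part of ggml): print the tools from the argument
# list that are not found on PATH, separated by spaces.
missing_tools() {
  local missing=""
  local tool
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || missing="$missing $tool"
  done
  # Strip the leading space left by the concatenation above.
  echo "${missing# }"
}

# Tools used by the build, download, and conversion commands.
missing="$(missing_tools git git-lfs cmake make python3)"
if [ -n "$missing" ]; then
  echo "missing tools: $missing" >&2
fi
```

If the check prints nothing, all five tools were found. Otherwise, install the listed tools before cloning the model, since cloning without git-lfs may leave placeholder pointer files in place of the actual weights.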