---
# Model card metadata following the Hugging Face specification:
# https://github.com/huggingface/hub-docs/blob/main/modelcard.md?plain=1
# Documentation: https://huggingface.co/docs/hub/model-cards
license: mit
tags:
  - llama-cpp-python
  - cuda
  - nvidia
  - blackwell
  - windows
  - prebuilt-wheels
  - python
  - machine-learning
  - large-language-models
  - gpu-acceleration
---

# llama-cpp-python 0.3.9 Prebuilt Wheel with CUDA Support for Windows

This repository provides a prebuilt Python wheel for **llama-cpp-python** (version 0.3.9) with NVIDIA CUDA support for Windows 10/11 (x64) systems. The wheel enables GPU-accelerated inference for large language models (LLMs) via the `llama.cpp` library, eliminating the need to compile from source. It is built for Python 3.10 and supports NVIDIA GPUs, including the latest Blackwell architecture.

## Available Wheel

- `llama_cpp_python-0.3.9-cp310-cp310-win_amd64.whl` (Python 3.10, CUDA 12.8)

## Compatibility

The wheel targets NVIDIA Blackwell GPUs and has also been tested and confirmed compatible with previous-generation NVIDIA GPUs. Confirmed-working cards include:

- NVIDIA RTX 5090
- NVIDIA RTX 3090

## Installation

To install the wheel, run the following command in a Python 3.10 environment:

```bash
pip install llama_cpp_python-0.3.9-cp310-cp310-win_amd64.whl
```
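After installation, GPU offload can be checked with a short script. This is a minimal sketch using the standard `llama-cpp-python` API; the model path is a placeholder for any local GGUF file you have downloaded (not provided by this repository), and `n_gpu_layers=-1` asks llama.cpp to offload all model layers to the GPU, which only takes effect with a CUDA-enabled build such as this wheel.

```python
from llama_cpp import Llama

# Placeholder path: point this at any local GGUF model file.
# n_gpu_layers=-1 offloads all layers to the GPU (CUDA build required);
# with a CPU-only build this setting is silently ignored.
llm = Llama(
    model_path="models/your-model.gguf",
    n_gpu_layers=-1,
    n_ctx=2048,
)

# Run a single completion and print the generated text.
output = llm("Q: Name the planets in the solar system. A:", max_tokens=32)
print(output["choices"][0]["text"])
```

If the console log printed at load time (when `verbose=True`, the default) mentions CUDA device initialization and layers being offloaded, the GPU build is active.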