---
datasets:
- greengerong/leetcode
language:
- en
base_model:
- codellama/CodeLlama-7b-Instruct-hf
pipeline_tag: text-generation
---

## 🧠 Fine-tuned CodeLlama on LeetCode Problems

**This model is a fine-tuned version of [`codellama/CodeLlama-7b-Instruct-hf`](https://huggingface.co/codellama/CodeLlama-7b-Instruct-hf) on the [`greengerong/leetcode`](https://huggingface.co/datasets/greengerong/leetcode) dataset. It has been instruction-tuned to generate Python solutions from LeetCode-style problem descriptions.**

---

## 📦 Model Formats Available

- **Transformers-compatible (`.safetensors`)**: for use via 🤗 Transformers.
- **GGUF (`.gguf`)**: for use via [llama.cpp](https://github.com/ggerganov/llama.cpp), including `llama-server`, `llama-cpp-python`, and other compatible tools.

---

## 🔗 Example Usage (Transformers)

```python
from transformers import AutoTokenizer, AutoModelForCausalLM, pipeline

model_id = "harshism1/codellama-leetcode-finetuned"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)
pipe = pipeline("text-generation", model=model, tokenizer=tokenizer)

prompt = """### Problem
Given an array of integers, return indices of the two numbers such that they add up to a specific target.

### Solution
"""

result = pipe(prompt, max_new_tokens=256, do_sample=True, temperature=0.7)
print(result[0]["generated_text"])
```

---

## ⚙️ Usage with `llama.cpp`

You can run the model with tools from the [`llama.cpp`](https://github.com/ggerganov/llama.cpp) ecosystem. Make sure you have the `.gguf` version of the model (e.g., `codellama-leetcode.gguf`).

### 🐍 Using `llama-cpp-python`

Install:

```bash
pip install llama-cpp-python
```

Then use:

```python
from llama_cpp import Llama

llm = Llama(
    model_path="codellama-leetcode.gguf",
    n_ctx=4096,        # context window size
    n_gpu_layers=99    # number of layers to offload; adjust based on your GPU
)

prompt = """### Problem
Given an array of integers, return indices of the two numbers such that they add up to a specific target.

### Solution
"""

output = llm(prompt, max_tokens=256)
print(output["choices"][0]["text"])
```

### 🖥️ Using `llama-server`

Start the server:

```bash
llama-server --model codellama-leetcode.gguf --port 8000 --n-gpu-layers 99
```

Then send a request:

```bash
curl http://localhost:8000/completion -d '{
  "prompt": "### Problem\nGiven an array of integers...\n\n### Solution\n",
  "n_predict": 256
}'
```
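
### 🔌 OpenAI-compatible API

Current `llama-server` builds also expose an OpenAI-compatible API under `/v1`, so the official `openai` Python client can point at the local server. The sketch below assumes the server started above is still running on port 8000 and that you have installed the client with `pip install openai`; the `model` argument is required by the client, but `llama-server` serves whichever model it was launched with, so any placeholder string should work.

```python
# Sketch: query llama-server's OpenAI-compatible /v1 endpoint.
# Assumes the llama-server instance from the section above is running on port 8000.
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:8000/v1",
    api_key="sk-no-key-required",  # llama-server does not check the key unless --api-key is set
)

prompt = """### Problem
Given an array of integers, return indices of the two numbers such that they add up to a specific target.

### Solution
"""

response = client.completions.create(
    model="codellama-leetcode",  # placeholder; the server uses its loaded model
    prompt=prompt,
    max_tokens=256,
)
print(response.choices[0].text)
```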
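
### 📥 Fetching the `.gguf` file

Both `llama.cpp` examples above assume the `.gguf` file is already on disk. One way to fetch it is with `huggingface_hub`, sketched below; note that `codellama-leetcode.gguf` is the placeholder filename used throughout this card, so check the repository's file listing for the actual name before running this.

```python
# Sketch: download the GGUF file from the Hub for use with llama.cpp tools.
from huggingface_hub import hf_hub_download

gguf_path = hf_hub_download(
    repo_id="harshism1/codellama-leetcode-finetuned",
    filename="codellama-leetcode.gguf",  # assumed name; verify in the repo's Files tab
)
print(gguf_path)  # pass this path to Llama(model_path=...) or llama-server --model
```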