---
datasets:
- greengerong/leetcode
language:
- en
base_model:
- codellama/CodeLlama-7b-Instruct-hf
pipeline_tag: text-generation
---

## 🧠 Fine-tuned CodeLlama on LeetCode Problems

This model is a fine-tuned version of [`codellama/CodeLlama-7b-Instruct-hf`](https://huggingface.co/codellama/CodeLlama-7b-Instruct-hf) on the [`greengerong/leetcode`](https://huggingface.co/datasets/greengerong/leetcode) dataset. It has been instruction-tuned to generate Python solutions from LeetCode-style problem descriptions.

---

## 📦 Model Formats Available

- **Transformers-compatible (`.safetensors`)** – for use via 🤗 Transformers.
- **GGUF (`.gguf`)** – for use via [llama.cpp](https://github.com/ggerganov/llama.cpp), including `llama-server`, `llama-cpp-python`, and other compatible tools (see the download sketch in the `llama.cpp` section below).

---

## 🚀 Example Usage (Transformers)

```python
from transformers import AutoTokenizer, AutoModelForCausalLM, pipeline

model_id = "harshism1/codellama-leetcode-finetuned"

# Load the fine-tuned weights and the matching tokenizer from the Hub
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

pipe = pipeline("text-generation", model=model, tokenizer=tokenizer)

prompt = """You are an AI assistant. Solve the following problem:

Given an array of integers, return indices of the two numbers such that they add up to a specific target.

## Solution
"""

# Sampled decoding; lower the temperature for more deterministic solutions
result = pipe(prompt, max_new_tokens=256, do_sample=True, temperature=0.7)
print(result[0]["generated_text"])
```
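
A 7B model is heavy in full precision, so on most GPUs you will want to load it in half precision with automatic device placement. A minimal variant of the loading step (a sketch, assuming `torch` and `accelerate` are installed):

```python
import torch
from transformers import AutoModelForCausalLM

# Half-precision weights, placed automatically across available devices
model = AutoModelForCausalLM.from_pretrained(
    "harshism1/codellama-leetcode-finetuned",
    torch_dtype=torch.float16,
    device_map="auto",
)
```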

## ⚙️ Usage with `llama.cpp`

You can run the model using tools from the [`llama.cpp`](https://github.com/ggerganov/llama.cpp) ecosystem. Make sure you have the `.gguf` version of the model (e.g., `codellama-leetcode.gguf`).
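
If you don't have the GGUF file locally, you can fetch it from this repo with `huggingface_hub`. The filename below is taken from the example above and is an assumption; check the repo's file list for the actual name:

```python
from huggingface_hub import hf_hub_download

# Download the GGUF weights from the Hub (filename is an assumption - verify it in the repo)
gguf_path = hf_hub_download(
    repo_id="harshism1/codellama-leetcode-finetuned",
    filename="codellama-leetcode.gguf",
)
print(gguf_path)
```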

### 🐍 Using `llama-cpp-python`

Install:

```bash
pip install llama-cpp-python
```
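
The default wheel is CPU-only, so the `n_gpu_layers` setting below has no effect with it. For CUDA offload you need a build with GPU support; as a sketch (the CMake flag name varies across `llama-cpp-python` releases and backends):

```bash
CMAKE_ARGS="-DGGML_CUDA=on" pip install llama-cpp-python --no-cache-dir --force-reinstall
```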

Then use:

```python
from llama_cpp import Llama

llm = Llama(
    model_path="codellama-leetcode.gguf",
    n_ctx=4096,       # context window size
    n_gpu_layers=99,  # offload all layers to the GPU; adjust based on your hardware
)

prompt = """### Problem
Given an array of integers, return indices of the two numbers such that they add up to a specific target.

## Solution
"""

output = llm(prompt, max_tokens=256)
print(output["choices"][0]["text"])
```
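
`llama-cpp-python` can also stream tokens as they are generated, which is useful for longer solutions; a minimal sketch reusing the `llm` object from above:

```python
# Stream the completion token by token instead of waiting for the full text
for chunk in llm(prompt, max_tokens=256, stream=True):
    print(chunk["choices"][0]["text"], end="", flush=True)
print()
```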

### 🖥️ Using `llama-server`

Start the server:

```bash
llama-server --model codellama-leetcode.gguf --port 8000 --n-gpu-layers 99
```

Then send a request:

```bash
curl http://localhost:8000/completion -d '{
  "prompt": "### Problem\nGiven an array of integers...\n\n## Solution\n",
  "n_predict": 256
}'
```
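
The same endpoint is easy to call from Python; a small sketch with `requests`, assuming the server above is running on port 8000:

```python
import requests

# POST the same payload the curl example sends to the /completion endpoint
response = requests.post(
    "http://localhost:8000/completion",
    json={
        "prompt": "### Problem\nGiven an array of integers, return indices of the two "
                  "numbers such that they add up to a specific target.\n\n## Solution\n",
        "n_predict": 256,
    },
)
print(response.json()["content"])
```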