---
datasets:
- greengerong/leetcode
language:
- en
base_model:
- codellama/CodeLlama-7b-Instruct-hf
pipeline_tag: text-generation
---

## 🧠 Fine-tuned CodeLlama on LeetCode Problems

This model is a fine-tuned version of [`codellama/CodeLlama-7b-Instruct-hf`](https://huggingface.co/codellama/CodeLlama-7b-Instruct-hf) on the [`greengerong/leetcode`](https://huggingface.co/datasets/greengerong/leetcode) dataset. It has been instruction-tuned to generate Python solutions from LeetCode-style problem descriptions.

---

## 📦 Model Formats Available

- **Transformers-compatible (`.safetensors`)** – for use via 🤗 Transformers.
- **GGUF (`.gguf`)** – for use via [llama.cpp](https://github.com/ggerganov/llama.cpp), including `llama-server`, `llama-cpp-python`, and other compatible tools (see the download sketch in the `llama.cpp` section below).

---

## 🚀 Example Usage (Transformers)

```python
from transformers import AutoTokenizer, AutoModelForCausalLM, pipeline

model_id = "harshism1/codellama-leetcode-finetuned"

# Load the fine-tuned weights and the matching tokenizer from the Hub
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

pipe = pipeline("text-generation", model=model, tokenizer=tokenizer)

prompt = """You are an AI assistant. Solve the following problem:

Given an array of integers, return indices of the two numbers such that they add up to a specific target.

## Solution
"""

# Sampled decoding; lower the temperature for more deterministic solutions
result = pipe(prompt, max_new_tokens=256, do_sample=True, temperature=0.7)
print(result[0]["generated_text"])
```
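
A 7B model is heavy in full precision, so on most GPUs you will want to load it in half precision with automatic device placement. A minimal variant of the loading step (a sketch, assuming `torch` and `accelerate` are installed):

```python
import torch
from transformers import AutoModelForCausalLM

# Half-precision weights, placed automatically across available devices
model = AutoModelForCausalLM.from_pretrained(
    "harshism1/codellama-leetcode-finetuned",
    torch_dtype=torch.float16,
    device_map="auto",
)
```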

## ⚙️ Usage with `llama.cpp`

You can run the model using tools from the [`llama.cpp`](https://github.com/ggerganov/llama.cpp) ecosystem. Make sure you have the `.gguf` version of the model (e.g., `codellama-leetcode.gguf`).
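
If you don't have the GGUF file locally, you can fetch it from this repo with `huggingface_hub`. The filename below is taken from the example above and is an assumption; check the repo's file list for the actual name:

```python
from huggingface_hub import hf_hub_download

# Download the GGUF weights from the Hub (filename is an assumption - verify it in the repo)
gguf_path = hf_hub_download(
    repo_id="harshism1/codellama-leetcode-finetuned",
    filename="codellama-leetcode.gguf",
)
print(gguf_path)
```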

### 🐍 Using `llama-cpp-python`

Install:

```bash
pip install llama-cpp-python
```
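
The default wheel is CPU-only, so the `n_gpu_layers` setting below has no effect with it. For CUDA offload you need a build with GPU support; as a sketch (the CMake flag name varies across `llama-cpp-python` releases and backends):

```bash
CMAKE_ARGS="-DGGML_CUDA=on" pip install llama-cpp-python --no-cache-dir --force-reinstall
```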

Then use:

```python
from llama_cpp import Llama

llm = Llama(
    model_path="codellama-leetcode.gguf",
    n_ctx=4096,       # context window size
    n_gpu_layers=99,  # offload all layers to the GPU; adjust based on your hardware
)

prompt = """### Problem
Given an array of integers, return indices of the two numbers such that they add up to a specific target.

## Solution
"""

output = llm(prompt, max_tokens=256)
print(output["choices"][0]["text"])
```
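
`llama-cpp-python` can also stream tokens as they are generated, which is useful for longer solutions; a minimal sketch reusing the `llm` object from above:

```python
# Stream the completion token by token instead of waiting for the full text
for chunk in llm(prompt, max_tokens=256, stream=True):
    print(chunk["choices"][0]["text"], end="", flush=True)
print()
```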

### 🖥️ Using `llama-server`

Start the server:

```bash
llama-server --model codellama-leetcode.gguf --port 8000 --n-gpu-layers 99
```

Then send a request:

```bash
curl http://localhost:8000/completion -d '{
  "prompt": "### Problem\nGiven an array of integers...\n\n## Solution\n",
  "n_predict": 256
}'
```
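
The same endpoint is easy to call from Python; a small sketch with `requests`, assuming the server above is running on port 8000:

```python
import requests

# POST the same payload the curl example sends to the /completion endpoint
response = requests.post(
    "http://localhost:8000/completion",
    json={
        "prompt": "### Problem\nGiven an array of integers, return indices of the two "
                  "numbers such that they add up to a specific target.\n\n## Solution\n",
        "n_predict": 256,
    },
)
print(response.json()["content"])
```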