metadata
license: other
language:
- en
tags:
- causal-lm
- code
metrics:
- code_eval
library_name: transformers
model-index:
- name: dgtalbug/stable-code-instruct-3b
  results:
  - task:
      type: text-generation
    dataset:
      type: nuprl/MultiPL-E
      name: MultiPL-HumanEval (Python)
    metrics:
    - name: pass@1
      type: pass@1
      value: 32.4
  - task:
      type: text-generation
    dataset:
      type: nuprl/MultiPL-E
      name: MultiPL-HumanEval (C++)
    metrics:
    - name: pass@1
      type: pass@1
      value: 30.9
  - task:
      type: text-generation
    dataset:
      type: nuprl/MultiPL-E
      name: MultiPL-HumanEval (Java)
    metrics:
    - name: pass@1
      type: pass@1
      value: 32.1
  - task:
      type: text-generation
    dataset:
      type: nuprl/MultiPL-E
      name: MultiPL-HumanEval (JavaScript)
    metrics:
    - name: pass@1
      type: pass@1
      value: 32.1
  - task:
      type: text-generation
    dataset:
      type: nuprl/MultiPL-E
      name: MultiPL-HumanEval (PHP)
    metrics:
    - name: pass@1
      type: pass@1
      value: 24.2
  - task:
      type: text-generation
    dataset:
      type: nuprl/MultiPL-E
      name: MultiPL-HumanEval (Rust)
    metrics:
    - name: pass@1
      type: pass@1
      value: 23
Stable Code Instruct 3B — Base Model
This repository stores an unchanged copy of stabilityai/stable-code-instruct-3b for use as a base model in future fine‑tuning projects (including Stephen).
📌 About the Model
stable-code-instruct-3b
is a 2.7B-parameter decoder-only transformer from Stability AI, tuned for multi‑language code generation and conversational coding assistance. It is suitable as a starting point for specialized code assistants, including variants fine‑tuned on domain‑specific datasets.
Key Features:
- General-purpose code generation across multiple programming languages.
- Instruction‑tuned for better conversational performance.
- pass@1 between 23.0% and 32.4% on the six MultiPL-E languages reported below.
📊 Performance (MultiPL-E Benchmark)
| Language | pass@1 |
|---|---|
| Python | 32.4% |
| C++ | 30.9% |
| Java | 32.1% |
| JavaScript | 32.1% |
| PHP | 24.2% |
| Rust | 23.0% |
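For readers unfamiliar with the metric, pass@1 is the estimated probability that a single generated completion passes a problem's unit tests, averaged over all problems in the benchmark. The sketch below shows the unbiased pass@k estimator commonly used with HumanEval-style benchmarks. The helper names `pass_at_k` and `benchmark_pass_at_k` and the sample counts are hypothetical, and the sampling settings behind the figures in the table are not stated here, so this illustrates the metric rather than reproducing the evaluation.

```python
from math import comb


def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k for one problem: 1 - C(n - c, k) / C(n, k).

    n: completions sampled for the problem
    c: completions that passed the unit tests
    k: number of attempts the metric allows
    """
    if not 0 <= c <= n or not 1 <= k <= n:
        raise ValueError("require 0 <= c <= n and 1 <= k <= n")
    if n - c < k:
        # Fewer than k failures exist, so any k samples include a pass.
        return 1.0
    return 1.0 - comb(n - c, k) / comb(n, k)


def benchmark_pass_at_k(results: list[tuple[int, int]], k: int = 1) -> float:
    """Average pass@k over problems, given (n_samples, n_passed) pairs."""
    return sum(pass_at_k(n, c, k) for n, c in results) / len(results)


# Hypothetical counts for four problems, 20 samples each (illustrative only).
example_results = [(20, 10), (20, 0), (20, 5), (20, 20)]
score = benchmark_pass_at_k(example_results, k=1)
print(f"pass@1 = {score:.1%}")
```

For k = 1 the estimator reduces to the fraction of passing samples per problem, so the benchmark score is the mean of those fractions.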
🚀 Usage
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "dgtalbug/stable-code-instruct-3b"

# Load the tokenizer and the model in bfloat16, then move the model to the GPU.
tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, trust_remote_code=True
).cuda().eval()

# Build the conversation and render it with the model's chat template.
messages = [
    {"role": "system", "content": "You are a helpful coding assistant."},
    {"role": "user", "content": "Write a Python function to reverse a string."},
]
prompt = tokenizer.apply_chat_template(messages, add_generation_prompt=True, tokenize=False)
inputs = tokenizer([prompt], return_tensors="pt").to(model.device)

# Sample up to 200 new tokens.
tokens = model.generate(
    **inputs,
    max_new_tokens=200,
    temperature=0.5,
    top_p=0.95,
    top_k=100,
    do_sample=True,
    use_cache=True,
)

# Decode only the newly generated tokens, skipping the prompt.
output = tokenizer.batch_decode(tokens[:, inputs.input_ids.shape[-1]:], skip_special_tokens=True)[0]
print(output)
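The snippet above samples with `temperature`, `top_p`, and `top_k`, so repeated runs can return different code. When reproducible output matters more than variety, one option is to switch to greedy decoding by passing `do_sample=False` and leaving out the sampling parameters. The helper below is a minimal sketch of that choice. `generation_kwargs` is a hypothetical name, and the sampling values are copied from the snippet above rather than being tuned recommendations.

```python
def generation_kwargs(deterministic: bool = False, max_new_tokens: int = 200) -> dict:
    """Build keyword arguments for `model.generate` (hypothetical helper).

    deterministic=True  -> greedy decoding; sampling parameters are omitted
                           because they only apply when do_sample=True.
    deterministic=False -> the sampling settings used in the usage snippet.
    """
    kwargs = {"max_new_tokens": max_new_tokens, "use_cache": True}
    if deterministic:
        kwargs["do_sample"] = False
    else:
        kwargs.update(do_sample=True, temperature=0.5, top_p=0.95, top_k=100)
    return kwargs


# Intended use with the objects from the usage snippet:
#   tokens = model.generate(**inputs, **generation_kwargs(deterministic=True))
print(generation_kwargs(deterministic=True))
print(generation_kwargs())
```

Keeping the two configurations in one place avoids mixing `do_sample=False` with leftover sampling parameters, a combination that has no effect on the greedy output.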
📜 License
This model is distributed under the Stability AI Community License.
For commercial use, refer to Stability AI's licensing terms.
📌 Note for Fine‑Tuning
The model in this repository is unmodified; the repository is kept as a clean base for derivative works.
Fine‑tuned versions (e.g., Stephen) will be released in separate repositories.