---
library_name: transformers
license: apache-2.0
language:
- en
base_model:
- Qwen/Qwen2.5-3B-Instruct
pipeline_tag: text-generation
tags:
- problem-solve
- text-generation-inference
- code
- math
---
![5.png](https://cdn-uploads.huggingface.co/production/uploads/65bb837dbfb878f46c77de4c/QZRJBa0cGFxdF-7jZ-1QH.png)

# **Draco-CoderMini-3B**

> **Draco-CoderMini-3B** is a compact, coding-optimized language model fine-tuned from **Qwen2.5-3B-Instruct** on the **Qwen2 architecture**, tailored for high-accuracy **code generation**, **debugging**, and **technical reasoning**. With **3 billion parameters**, it balances capability and deployability, making it a practical assistant for developers, educators, and engineers who work in constrained environments or need fast inference.

> [!NOTE]
> GGUF: [https://huggingface.co/prithivMLmods/Draco-CoderMini-3B-GGUF](https://huggingface.co/prithivMLmods/Draco-CoderMini-3B-GGUF)
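One way to run the GGUF build locally is via llama-cpp-python; the sketch below is a minimal example, not an official recipe. The quantization filename pattern is a placeholder and should be matched against the files actually published in the GGUF repository.

```python
from llama_cpp import Llama

# Placeholder filename pattern; pick a quantization from the GGUF repo that fits your hardware.
llm = Llama.from_pretrained(
    repo_id="prithivMLmods/Draco-CoderMini-3B-GGUF",
    filename="*Q4_K_M.gguf",
    n_ctx=4096,
)

out = llm.create_chat_completion(
    messages=[
        {"role": "system", "content": "You are a helpful coding assistant."},
        {"role": "user", "content": "Write a Python function to check if a number is prime."},
    ],
    max_tokens=512,
)
print(out["choices"][0]["message"]["content"])
```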

---

## **Key Features**

1. **Qwen2 Architecture Core**
   Built on the robust and scalable **Qwen2** transformer backbone, offering solid performance on both single-turn and multi-step code workflows.

2. **Code-First Training Focus**
   Fine-tuned primarily on coding datasets across Python, JavaScript, C++, and Bash, with additional coverage of software documentation, APIs, and debugging tasks.

3. **Multi-Step Reasoning in Code**
   Capable of breaking down complex programming problems, explaining logic, and correcting bugs—ideal for students, engineers, and software instructors.

4. **Structured Format Proficiency**
   Outputs syntactically correct code blocks, JSON, YAML, and Markdown—streamlining integration into tools, notebooks, and docs (see the JSON sketch after this list).

5. **Lightweight Yet Powerful**
   At 3B parameters, it provides strong results without the heavy resource demands of larger models, and is deployable on most modern GPUs or powerful CPUs.

6. **Cross-Language Coding Support**
   Generates and interprets code in 10+ languages with emphasis on real-world application, scripting, and algorithmic problem-solving.
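
As a rough illustration of point 4 above, the sketch below asks the model for a JSON reply and parses it. The prompt, schema, and generation settings are invented for this example, and a 3B model may still wrap JSON in prose, so the parse is guarded.

```python
import json
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "prithivMLmods/Draco-CoderMini-3B"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name, torch_dtype="auto", device_map="auto")

messages = [
    {"role": "system", "content": "You are a helpful coding assistant. Respond only with valid JSON."},
    {"role": "user", "content": "Describe Python's sorted() as JSON with keys name, args, and returns."},
]
text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer([text], return_tensors="pt").to(model.device)
output_ids = model.generate(**inputs, max_new_tokens=256)
reply = tokenizer.decode(output_ids[0, inputs.input_ids.shape[1]:], skip_special_tokens=True)

# Guard the parse: small models sometimes add prose around the JSON payload.
try:
    print(json.loads(reply))
except json.JSONDecodeError:
    print("Non-JSON reply:", reply)
```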

---

## **Quickstart with Transformers**

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "prithivMLmods/Draco-CoderMini-3B"

model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype="auto",
    device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained(model_name)

prompt = "Write a Python function to check if a number is prime."

messages = [
    {"role": "system", "content": "You are a helpful coding assistant."},
    {"role": "user", "content": prompt}
]

text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True
)

model_inputs = tokenizer([text], return_tensors="pt").to(model.device)

generated_ids = model.generate(
    **model_inputs,
    max_new_tokens=512
)
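# Strip the prompt tokens so only the newly generated completion is decoded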
generated_ids = [
    output_ids[len(input_ids):] for input_ids, output_ids in zip(model_inputs.input_ids, generated_ids)
]

response = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]
print(response)
```
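
Decoding settings noticeably affect code output. The values below are illustrative starting points rather than tuned recommendations; greedy decoding keeps completions reproducible, which is often preferable for code.

```python
# Continues the Quickstart above; `model` and `model_inputs` are already defined.
generated_ids = model.generate(
    **model_inputs,
    max_new_tokens=512,
    do_sample=False,          # greedy decoding for reproducible code output
    repetition_penalty=1.05,  # illustrative: a mild guard against repetition loops
)
```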

---

## **Intended Use**

* Code generation, translation, and refactoring
* Teaching and tutoring in programming concepts
* Technical documentation generation and API auto-fill
* Debugging assistant with error analysis and fixes (see the sketch after this list)
* Lightweight deployment in IDEs, coding platforms, and offline environments
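
As a sketch of the debugging use case, the snippet below feeds a deliberately broken function and its error message through the same chat template as the Quickstart. The buggy code and error string are invented for this illustration.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "prithivMLmods/Draco-CoderMini-3B"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name, torch_dtype="auto", device_map="auto")

# Invented example: a function that crashes on empty input, plus its error message.
buggy = "def mean(xs):\n    return sum(xs) / len(xs)\n\nprint(mean([]))"
error = "ZeroDivisionError: division by zero"

messages = [
    {"role": "system", "content": "You are a helpful coding assistant."},
    {"role": "user", "content": f"This code fails with `{error}`. Explain the bug and return a fixed version:\n\n{buggy}"},
]
text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer([text], return_tensors="pt").to(model.device)
output_ids = model.generate(**inputs, max_new_tokens=400)
print(tokenizer.decode(output_ids[0, inputs.input_ids.shape[1]:], skip_special_tokens=True))
```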

---

## **Limitations**

* Shorter context window than larger coding models (e.g., 7B+)
* May require prompt engineering for deeply nested or obscure code patterns
* Limited fluency in non-programming natural language dialogue
* Not optimized for purely creative writing or storytelling tasks

---

## **References**

1. [Qwen2.5 Technical Report](https://arxiv.org/pdf/2412.15115)
2. [YaRN: Efficient Context Window Extension of Large Language Models](https://arxiv.org/pdf/2309.00071)