Update README.md
---

This is a fine-tuned deepseek-coder-1.3b-base model for automatic completion of Solidity code. The model was fine-tuned with Quantized Low-Rank Adaptation (QLoRA), a Parameter-Efficient Fine-Tuning (PEFT) method, on a Fill-in-the-Middle (FIM) transformed dataset consisting of Solidity constructs (functions, modifiers, mappings, etc.). The model has a maximum sequence length of 256 tokens.
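A FIM transform splits each training sample into a prefix, a middle, and a suffix, and rearranges them so the model learns to infill the middle from the surrounding context. A minimal sketch of such a transform, assuming generic sentinel strings (`<fim_prefix>` etc. are illustrative placeholders, not the model's actual special tokens):

```python
import random

def fim_transform(code: str, rng: random.Random,
                  prefix_tok: str = "<fim_prefix>",
                  suffix_tok: str = "<fim_suffix>",
                  middle_tok: str = "<fim_middle>") -> str:
    """Split `code` at two random points and emit it in
    prefix/suffix/middle (PSM) order with FIM sentinel tokens."""
    lo, hi = sorted(rng.sample(range(len(code) + 1), 2))
    prefix, middle, suffix = code[:lo], code[lo:hi], code[hi:]
    # The model sees prefix and suffix, then predicts the middle.
    return f"{prefix_tok}{prefix}{suffix_tok}{suffix}{middle_tok}{middle}"

sample = "function transfer(address to, uint256 amount) public {}"
print(fim_transform(sample, random.Random(0)))
```

Concatenating prefix + middle + suffix always recovers the original sample, which is what makes the transform lossless for training.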
General fine-tuning information:

- Epochs: 2
- Optimizer: paged AdamW 8-bit
- Batch size: 8
- LoRA target modules: ["q_proj", "o_proj", "k_proj", "v_proj"]
- Quantization type: normal float 4-bit
- QLoRA compute type: brain float 16-bit
- Total time: 1 hour 23 minutes
Some of the hyperparameters were determined via hyperparameter optimization with Ray Tune. The corresponding results for the best trial were:

- Learning rate: 0.00016
- Weight decay: 0.0534
- Warmup steps: 100
- Gradient accumulation steps: 2
- LoRA rank: 64
- LoRA alpha: 64
- LoRA dropout: 0.0934665
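A Ray Tune search could be defined over these hyperparameters as shown below. The search space bounds here are hypothetical (the card does not state them); they are chosen only to be consistent with the best-trial values above.

```python
from ray import tune

# Hypothetical search space; bounds are assumptions, not from the card
search_space = {
    "learning_rate": tune.loguniform(1e-5, 1e-3),
    "weight_decay": tune.uniform(0.0, 0.1),
    "warmup_steps": tune.choice([50, 100, 200]),
    "gradient_accumulation_steps": tune.choice([1, 2, 4]),
    "lora_rank": tune.choice([16, 32, 64]),
    "lora_alpha": tune.choice([16, 32, 64]),
    "lora_dropout": tune.uniform(0.0, 0.1),
}
```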
39 |
+
|
40 |
+
The Fine-tuning results are:
|
41 |
+
|
42 |
+
- Training loss: ~0.7
|
43 |
+
- Validation loss: ~0.75
|
The model was evaluated on the test split and compared to the base model. The following metrics were used: perplexity, BLEU, and METEOR. The perplexity results are:

- Perplexity base model: 12.08
- Perplexity fine-tuned model: 2.19
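Perplexity is the exponential of the average per-token negative log-likelihood, so the drop from 12.08 to 2.19 reflects a much lower average loss on the test split. A minimal sketch of the computation (the loss values below are illustrative, not the actual measurements):

```python
import math

def perplexity(token_nlls: list[float]) -> float:
    """Perplexity = exp(mean negative log-likelihood per token)."""
    return math.exp(sum(token_nlls) / len(token_nlls))

# A lower average loss per token gives a lower (better) perplexity:
print(round(perplexity([2.5, 2.4, 2.6]), 2))  # → 12.18
print(round(perplexity([0.8, 0.7, 0.9]), 2))  # → 2.23
```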
The following code shows an example of how to use the model:

```python
# Load the fine-tuned model
from transformers import AutoTokenizer, AutoModelForCausalLM
```