SystemAdmin123 committed (verified)
Commit 57e4656 · Parent(s): 771add3

End of training

Files changed (2):
  1. README.md (+18 −21)
  2. generation_config.json (+3 −3)
README.md CHANGED
@@ -1,7 +1,6 @@
 ---
 library_name: transformers
-license: apache-2.0
-base_model: JackFram/llama-68m
+base_model: Xenova/tiny-random-Phi3ForCausalLM
 tags:
 - axolotl
 - generated_from_trainer
@@ -20,7 +19,7 @@ should probably proofread and complete it, then remove this comment. -->
 
 axolotl version: `0.6.0`
 ```yaml
-base_model: JackFram/llama-68m
+base_model: Xenova/tiny-random-Phi3ForCausalLM
 batch_size: 32
 bf16: true
 chat_template: tokenizer_default_fallback_alpaca
@@ -58,15 +57,13 @@ sample_packing: false
 save_steps: 400
 save_total_limit: 1
 sequence_len: 2048
-special_tokens:
-  pad_token: </s>
 tokenizer_type: LlamaTokenizerFast
 torch_dtype: bf16
 trust_remote_code: true
 val_set_size: 0.1
 wandb_entity: ''
 wandb_mode: online
-wandb_name: JackFram/llama-68m-argilla/databricks-dolly-15k-curated-en
+wandb_name: Xenova/tiny-random-Phi3ForCausalLM-argilla/databricks-dolly-15k-curated-en
 wandb_project: Gradients-On-Demand
 wandb_run: your_name
 wandb_runid: default
@@ -78,9 +75,9 @@ warmup_ratio: 0.05
 
 # test-repo
 
-This model is a fine-tuned version of [JackFram/llama-68m](https://huggingface.co/JackFram/llama-68m) on the argilla/databricks-dolly-15k-curated-en dataset.
+This model is a fine-tuned version of [Xenova/tiny-random-Phi3ForCausalLM](https://huggingface.co/Xenova/tiny-random-Phi3ForCausalLM) on the argilla/databricks-dolly-15k-curated-en dataset.
 It achieves the following results on the evaluation set:
-- Loss: 2.9544
+- Loss: 8.0331
 
 ## Model description
 
@@ -112,19 +109,19 @@ The following hyperparameters were used during training:
 
 | Training Loss | Epoch  | Step | Validation Loss |
 |:-------------:|:------:|:----:|:---------------:|
-| No log        | 0.0003 | 1    | 4.0313          |
-| 3.4674        | 0.0592 | 200  | 3.5561          |
-| 2.9871        | 0.1184 | 400  | 3.5949          |
-| 3.1757        | 0.1776 | 600  | 3.3786          |
-| 3.3312        | 0.2368 | 800  | 3.4476          |
-| 2.6814        | 0.2959 | 1000 | 3.1961          |
-| 2.7786        | 0.3551 | 1200 | 3.1063          |
-| 2.9228        | 0.4143 | 1400 | 3.0346          |
-| 3.1288        | 0.4735 | 1600 | 2.9915          |
-| 3.1214        | 0.5327 | 1800 | 2.9728          |
-| 3.1533        | 0.5919 | 2000 | 2.9505          |
-| 2.4877        | 0.6511 | 2200 | 2.9438          |
-| 2.8664        | 0.7103 | 2400 | 2.9544          |
+| No log        | 0.0003 | 1    | 10.3773         |
+| 9.2412        | 0.0592 | 200  | 9.2247          |
+| 8.2947        | 0.1184 | 400  | 8.4011          |
+| 8.107         | 0.1776 | 600  | 8.1491          |
+| 8.0544        | 0.2368 | 800  | 8.0751          |
+| 7.8343        | 0.2959 | 1000 | 8.0513          |
+| 7.9614        | 0.3551 | 1200 | 8.0328          |
+| 8.0718        | 0.4143 | 1400 | 8.0292          |
+| 8.0845        | 0.4735 | 1600 | 8.0296          |
+| 7.9257        | 0.5327 | 1800 | 8.0282          |
+| 7.8742        | 0.5919 | 2000 | 8.0276          |
+| 8.0265        | 0.6511 | 2200 | 8.0276          |
+| 7.8113        | 0.7103 | 2400 | 8.0331          |
 
 
 ### Framework versions
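The axolotl config embedded in the README diff above is a flat list of `key: value` pairs. As a minimal sketch (using only a representative subset of the keys shown, and no YAML library, since none of these entries are nested), it can be read back into a dict like this:

```python
# Flat key/value pairs copied from the axolotl config in the diff above
# (subset only; values are kept as strings, cast as needed).
config_text = """\
base_model: Xenova/tiny-random-Phi3ForCausalLM
batch_size: 32
bf16: true
sequence_len: 2048
val_set_size: 0.1
"""

config = {}
for line in config_text.splitlines():
    key, _, value = line.partition(": ")
    config[key] = value

assert config["base_model"] == "Xenova/tiny-random-Phi3ForCausalLM"
print(config["batch_size"])  # -> 32 (as a string)
```

A real config with nested keys (like the removed `special_tokens:` block) would need a proper YAML parser such as PyYAML instead.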
generation_config.json CHANGED
@@ -1,8 +1,8 @@
 {
   "_from_model_config": true,
-  "bos_token_id": 0,
+  "bos_token_id": 1,
   "do_sample": true,
-  "eos_token_id": 2,
-  "pad_token_id": 1,
+  "eos_token_id": 32000,
+  "pad_token_id": 32000,
   "transformers_version": "4.48.1"
 }
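The token-id changes above track the base-model swap: the old values were LLaMA-style ids, while the new ones are consistent with Phi-3-style tokenizers, where id 32000 is typically `<|endoftext|>` (an assumption based on the diff, not stated in the commit). A quick sketch of the post-commit file, parsed with the standard library:

```python
import json

# Post-commit generation_config.json, values copied from the diff above.
new_config = json.loads("""
{
  "_from_model_config": true,
  "bos_token_id": 1,
  "do_sample": true,
  "eos_token_id": 32000,
  "pad_token_id": 32000,
  "transformers_version": "4.48.1"
}
""")

# After the change, eos and pad share the same id, a common setup when
# the tokenizer has no dedicated pad token.
assert new_config["eos_token_id"] == new_config["pad_token_id"] == 32000
assert new_config["bos_token_id"] == 1
```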