SystemAdmin123 committed on
Commit d081ecd · verified · 1 Parent(s): f475e86

End of training

Files changed (2)
  1. README.md +44 -44
  2. pytorch_model.bin +1 -1
README.md CHANGED
@@ -76,7 +76,7 @@ xformers_attention: true

  This model is a fine-tuned version of [fxmarty/small-llama-testing](https://huggingface.co/fxmarty/small-llama-testing) on the None dataset.
  It achieves the following results on the evaluation set:
- - Loss: 6.0835
+ - Loss: 6.0848

  ## Model description

@@ -113,50 +113,50 @@ The following hyperparameters were used during training:
  | Training Loss | Epoch | Step | Validation Loss |
  |:-------------:|:------:|:----:|:---------------:|
  | No log | 0.0112 | 1 | 10.4228 |
- | 10.127 | 0.2247 | 20 | 9.8631 |
- | 9.0392 | 0.4494 | 40 | 8.7402 |
- | 8.1126 | 0.6742 | 60 | 7.9188 |
- | 7.5512 | 0.8989 | 80 | 7.4579 |
+ | 10.127 | 0.2247 | 20 | 9.8632 |
+ | 9.0393 | 0.4494 | 40 | 8.7403 |
+ | 8.1127 | 0.6742 | 60 | 7.9189 |
+ | 7.5513 | 0.8989 | 80 | 7.4579 |
  | 7.2769 | 1.1236 | 100 | 7.2770 |
- | 7.1383 | 1.3483 | 120 | 7.1767 |
- | 7.0576 | 1.5730 | 140 | 7.0574 |
- | 6.9563 | 1.7978 | 160 | 6.9376 |
- | 6.8782 | 2.0225 | 180 | 6.8207 |
- | 6.7025 | 2.2472 | 200 | 6.7210 |
- | 6.5911 | 2.4719 | 220 | 6.6361 |
- | 6.498 | 2.6966 | 240 | 6.5573 |
- | 6.4453 | 2.9213 | 260 | 6.4724 |
- | 6.2635 | 3.1461 | 280 | 6.4125 |
- | 6.2357 | 3.3708 | 300 | 6.3663 |
- | 6.2741 | 3.5955 | 320 | 6.3164 |
- | 6.2488 | 3.8202 | 340 | 6.2884 |
- | 6.1751 | 4.0449 | 360 | 6.2413 |
- | 6.0513 | 4.2697 | 380 | 6.2190 |
- | 6.0156 | 4.4944 | 400 | 6.1959 |
- | 6.0039 | 4.7191 | 420 | 6.1759 |
- | 6.0234 | 4.9438 | 440 | 6.1539 |
- | 5.9595 | 5.1685 | 460 | 6.1439 |
- | 6.0205 | 5.3933 | 480 | 6.1280 |
- | 5.9366 | 5.6180 | 500 | 6.1228 |
- | 5.8248 | 5.8427 | 520 | 6.1082 |
- | 5.8747 | 6.0674 | 540 | 6.1063 |
- | 5.837 | 6.2921 | 560 | 6.1010 |
- | 5.8512 | 6.5169 | 580 | 6.0969 |
- | 5.929 | 6.7416 | 600 | 6.0999 |
- | 5.8924 | 6.9663 | 620 | 6.0975 |
- | 5.8913 | 7.1910 | 640 | 6.0956 |
- | 5.8253 | 7.4157 | 660 | 6.0936 |
- | 5.8198 | 7.6404 | 680 | 6.0885 |
- | 5.8615 | 7.8652 | 700 | 6.0869 |
- | 5.8929 | 8.0899 | 720 | 6.0945 |
- | 5.8676 | 8.3146 | 740 | 6.0892 |
- | 5.9057 | 8.5393 | 760 | 6.0875 |
- | 5.8127 | 8.7640 | 780 | 6.0881 |
- | 5.7864 | 8.9888 | 800 | 6.0902 |
- | 5.8074 | 9.2135 | 820 | 6.0924 |
- | 5.8298 | 9.4382 | 840 | 6.0843 |
- | 5.8487 | 9.6629 | 860 | 6.0887 |
- | 5.8496 | 9.8876 | 880 | 6.0835 |
+ | 7.1384 | 1.3483 | 120 | 7.1767 |
+ | 7.0576 | 1.5730 | 140 | 7.0575 |
+ | 6.9564 | 1.7978 | 160 | 6.9379 |
+ | 6.8785 | 2.0225 | 180 | 6.8208 |
+ | 6.7027 | 2.2472 | 200 | 6.7212 |
+ | 6.5913 | 2.4719 | 220 | 6.6362 |
+ | 6.498 | 2.6966 | 240 | 6.5572 |
+ | 6.4453 | 2.9213 | 260 | 6.4721 |
+ | 6.2635 | 3.1461 | 280 | 6.4126 |
+ | 6.236 | 3.3708 | 300 | 6.3658 |
+ | 6.2733 | 3.5955 | 320 | 6.3162 |
+ | 6.2472 | 3.8202 | 340 | 6.2870 |
+ | 6.1738 | 4.0449 | 360 | 6.2401 |
+ | 6.0509 | 4.2697 | 380 | 6.2184 |
+ | 6.0158 | 4.4944 | 400 | 6.1959 |
+ | 6.0043 | 4.7191 | 420 | 6.1770 |
+ | 6.0249 | 4.9438 | 440 | 6.1570 |
+ | 5.9625 | 5.1685 | 460 | 6.1471 |
+ | 6.0231 | 5.3933 | 480 | 6.1303 |
+ | 5.9395 | 5.6180 | 500 | 6.1241 |
+ | 5.8278 | 5.8427 | 520 | 6.1094 |
+ | 5.8774 | 6.0674 | 540 | 6.1078 |
+ | 5.8393 | 6.2921 | 560 | 6.1025 |
+ | 5.8534 | 6.5169 | 580 | 6.0983 |
+ | 5.9313 | 6.7416 | 600 | 6.1013 |
+ | 5.8947 | 6.9663 | 620 | 6.0989 |
+ | 5.8936 | 7.1910 | 640 | 6.0971 |
+ | 5.8275 | 7.4157 | 660 | 6.0950 |
+ | 5.822 | 7.6404 | 680 | 6.0899 |
+ | 5.8637 | 7.8652 | 700 | 6.0883 |
+ | 5.8951 | 8.0899 | 720 | 6.0958 |
+ | 5.8697 | 8.3146 | 740 | 6.0906 |
+ | 5.9076 | 8.5393 | 760 | 6.0889 |
+ | 5.8149 | 8.7640 | 780 | 6.0894 |
+ | 5.7888 | 8.9888 | 800 | 6.0916 |
+ | 5.8096 | 9.2135 | 820 | 6.0938 |
+ | 5.8319 | 9.4382 | 840 | 6.0857 |
+ | 5.8508 | 9.6629 | 860 | 6.0901 |
+ | 5.8517 | 9.8876 | 880 | 6.0848 |


  ### Framework versions
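Reviewer note: the two README versions differ only in the fourth decimal place of most loss values. As a rough way to judge how small that is, the sketch below converts the final evaluation loss of each version to perplexity. It assumes the reported loss is a mean per-token cross-entropy in nats, which the README does not state explicitly; the helper name `perplexity` is made up for this illustration.

```python
import math


def perplexity(mean_cross_entropy_nats: float) -> float:
    """Convert a mean per-token cross-entropy (assumed to be in nats) to perplexity."""
    return math.exp(mean_cross_entropy_nats)


# Final evaluation losses copied from the removed and added README lines.
old_loss = 6.0835
new_loss = 6.0848

old_ppl = perplexity(old_loss)
new_ppl = perplexity(new_loss)

# Relative change in perplexity between the two README versions.
relative_change = (new_ppl - old_ppl) / old_ppl

print(f"old perplexity: {old_ppl:.2f}")
print(f"new perplexity: {new_ppl:.2f}")
print(f"relative change: {relative_change:.4%}")
```

Under that assumption the change in perplexity is on the order of a tenth of a percent, which is consistent with run-to-run numerical noise rather than a change in training setup, though the diff alone cannot confirm the cause.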
pytorch_model.bin CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:137e48da8369e255736de16ead56592b161f61e345ccf6323d753997d4e0a736
+ oid sha256:797fbcef7a7434b5567ed5acc5c3f32e59a15ebee16c58da2131d7d750e3ff42
  size 34219693
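Reviewer note: `pytorch_model.bin` is stored as a Git LFS pointer, so the diff shows only a new SHA-256 object ID with an unchanged size. The sketch below parses a pointer in this three-line format and checks a local file against it. The helper names `parse_lfs_pointer` and `matches_pointer` are hypothetical and written for this illustration; they are not part of any Git LFS or Hugging Face tooling.

```python
import hashlib
from pathlib import Path


def parse_lfs_pointer(text: str) -> dict:
    """Parse the 'key value' lines of a Git LFS pointer file into a dict."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.strip().partition(" ")
        fields[key] = value
    # The oid field has the form "<hash algorithm>:<hex digest>".
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


def matches_pointer(path, pointer: dict) -> bool:
    """Return True if the file at `path` has the size and digest named by the pointer."""
    path = Path(path)
    if path.stat().st_size != pointer["size"]:
        return False
    hasher = hashlib.new(pointer["algo"])
    with path.open("rb") as handle:
        # Read in 1 MiB chunks so large weight files are not loaded into memory at once.
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            hasher.update(chunk)
    return hasher.hexdigest() == pointer["digest"]


# Pointer contents copied from the added side of the diff above.
NEW_POINTER = """version https://git-lfs.github.com/spec/v1
oid sha256:797fbcef7a7434b5567ed5acc5c3f32e59a15ebee16c58da2131d7d750e3ff42
size 34219693
"""

pointer = parse_lfs_pointer(NEW_POINTER)
print(pointer["algo"], pointer["size"])
```

With a downloaded copy of the weights, `matches_pointer("pytorch_model.bin", pointer)` would indicate whether the local file corresponds to this commit; the path here is only an example.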