SystemAdmin123/test-repo

@@ -76,7 +76,7 @@ xformers_attention: true
 This model is a fine-tuned version of [fxmarty/small-llama-testing](https://huggingface.co/fxmarty/small-llama-testing) on the None dataset.
 It achieves the following results on the evaluation set:
-- Loss: 6.0835
 ## Model description
@@ -113,50 +113,50 @@ The following hyperparameters were used during training:
 | Training Loss | Epoch  | Step | Validation Loss |
 |:-------------:|:------:|:----:|:---------------:|
 | No log        | 0.0112 | 1    | 10.4228         |
-| 10.127        | 0.2247 | 20   | 9.8631          |
-| 9.0392        | 0.4494 | 40   | 8.7402          |
-| 8.1126        | 0.6742 | 60   | 7.9188          |
-| 7.5512        | 0.8989 | 80   | 7.4579          |
 | 7.2769        | 1.1236 | 100  | 7.2770          |
-| 7.1383        | 1.3483 | 120  | 7.1767          |
-| 7.0576        | 1.5730 | 140  | 7.0574          |
-| 6.9563        | 1.7978 | 160  | 6.9376          |
-| 6.8782        | 2.0225 | 180  | 6.8207          |
-| 6.7025        | 2.2472 | 200  | 6.7210          |
-| 6.5911        | 2.4719 | 220  | 6.6361          |
-| 6.498         | 2.6966 | 240  | 6.5573          |
-| 6.4453        | 2.9213 | 260  | 6.4724          |
-| 6.2635        | 3.1461 | 280  | 6.4125          |
-| 6.2357        | 3.3708 | 300  | 6.3663          |
-| 6.2741        | 3.5955 | 320  | 6.3164          |
-| 6.2488        | 3.8202 | 340  | 6.2884          |
-| 6.1751        | 4.0449 | 360  | 6.2413          |
-| 6.0513        | 4.2697 | 380  | 6.2190          |
-| 6.0156        | 4.4944 | 400  | 6.1959          |
-| 6.0039        | 4.7191 | 420  | 6.1759          |
-| 6.0234        | 4.9438 | 440  | 6.1539          |
-| 5.9595        | 5.1685 | 460  | 6.1439          |
-| 6.0205        | 5.3933 | 480  | 6.1280          |
-| 5.9366        | 5.6180 | 500  | 6.1228          |
-| 5.8248        | 5.8427 | 520  | 6.1082          |
-| 5.8747        | 6.0674 | 540  | 6.1063          |
-| 5.837         | 6.2921 | 560  | 6.1010          |
-| 5.8512        | 6.5169 | 580  | 6.0969          |
-| 5.929         | 6.7416 | 600  | 6.0999          |
-| 5.8924        | 6.9663 | 620  | 6.0975          |
-| 5.8913        | 7.1910 | 640  | 6.0956          |
-| 5.8253        | 7.4157 | 660  | 6.0936          |
-| 5.8198        | 7.6404 | 680  | 6.0885          |
-| 5.8615        | 7.8652 | 700  | 6.0869          |
-| 5.8929        | 8.0899 | 720  | 6.0945          |
-| 5.8676        | 8.3146 | 740  | 6.0892          |
-| 5.9057        | 8.5393 | 760  | 6.0875          |
-| 5.8127        | 8.7640 | 780  | 6.0881          |
-| 5.7864        | 8.9888 | 800  | 6.0902          |
-| 5.8074        | 9.2135 | 820  | 6.0924          |
-| 5.8298        | 9.4382 | 840  | 6.0843          |
-| 5.8487        | 9.6629 | 860  | 6.0887          |
-| 5.8496        | 9.8876 | 880  | 6.0835          |
 ### Framework versions

 This model is a fine-tuned version of [fxmarty/small-llama-testing](https://huggingface.co/fxmarty/small-llama-testing) on the None dataset.
 It achieves the following results on the evaluation set:
+- Loss: 6.0848
 ## Model description
 | Training Loss | Epoch  | Step | Validation Loss |
 |:-------------:|:------:|:----:|:---------------:|
 | No log        | 0.0112 | 1    | 10.4228         |
+| 10.127        | 0.2247 | 20   | 9.8632          |
+| 9.0393        | 0.4494 | 40   | 8.7403          |
+| 8.1127        | 0.6742 | 60   | 7.9189          |
+| 7.5513        | 0.8989 | 80   | 7.4579          |
 | 7.2769        | 1.1236 | 100  | 7.2770          |
+| 7.1384        | 1.3483 | 120  | 7.1767          |
+| 7.0576        | 1.5730 | 140  | 7.0575          |
+| 6.9564        | 1.7978 | 160  | 6.9379          |
+| 6.8785        | 2.0225 | 180  | 6.8208          |
+| 6.7027        | 2.2472 | 200  | 6.7212          |
+| 6.5913        | 2.4719 | 220  | 6.6362          |
+| 6.498         | 2.6966 | 240  | 6.5572          |
+| 6.4453        | 2.9213 | 260  | 6.4721          |
+| 6.2635        | 3.1461 | 280  | 6.4126          |
+| 6.236         | 3.3708 | 300  | 6.3658          |
+| 6.2733        | 3.5955 | 320  | 6.3162          |
+| 6.2472        | 3.8202 | 340  | 6.2870          |
+| 6.1738        | 4.0449 | 360  | 6.2401          |
+| 6.0509        | 4.2697 | 380  | 6.2184          |
+| 6.0158        | 4.4944 | 400  | 6.1959          |
+| 6.0043        | 4.7191 | 420  | 6.1770          |
+| 6.0249        | 4.9438 | 440  | 6.1570          |
+| 5.9625        | 5.1685 | 460  | 6.1471          |
+| 6.0231        | 5.3933 | 480  | 6.1303          |
+| 5.9395        | 5.6180 | 500  | 6.1241          |
+| 5.8278        | 5.8427 | 520  | 6.1094          |
+| 5.8774        | 6.0674 | 540  | 6.1078          |
+| 5.8393        | 6.2921 | 560  | 6.1025          |
+| 5.8534        | 6.5169 | 580  | 6.0983          |
+| 5.9313        | 6.7416 | 600  | 6.1013          |
+| 5.8947        | 6.9663 | 620  | 6.0989          |
+| 5.8936        | 7.1910 | 640  | 6.0971          |
+| 5.8275        | 7.4157 | 660  | 6.0950          |
+| 5.822         | 7.6404 | 680  | 6.0899          |
+| 5.8637        | 7.8652 | 700  | 6.0883          |
+| 5.8951        | 8.0899 | 720  | 6.0958          |
+| 5.8697        | 8.3146 | 740  | 6.0906          |
+| 5.9076        | 8.5393 | 760  | 6.0889          |
+| 5.8149        | 8.7640 | 780  | 6.0894          |
+| 5.7888        | 8.9888 | 800  | 6.0916          |
+| 5.8096        | 9.2135 | 820  | 6.0938          |
+| 5.8319        | 9.4382 | 840  | 6.0857          |
+| 5.8508        | 9.6629 | 860  | 6.0901          |
+| 5.8517        | 9.8876 | 880  | 6.0848          |
 ### Framework versions

pytorch_model.bin CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:137e48da8369e255736de16ead56592b161f61e345ccf6323d753997d4e0a736
 size 34219693

 version https://git-lfs.github.com/spec/v1
+oid sha256:797fbcef7a7434b5567ed5acc5c3f32e59a15ebee16c58da2131d7d750e3ff42
 size 34219693
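
The two pointer files above differ only in their `oid`; `size` stays at 34219693 bytes, so the weights changed while the file length did not. A minimal sketch of checking a downloaded file against such a pointer, assuming the standard Git LFS pointer layout (`version`, `oid sha256:<hex>`, `size <bytes>`). The helper names `parse_lfs_pointer` and `matches_pointer` are hypothetical and not part of any library:

```python
import hashlib


def parse_lfs_pointer(text: str) -> dict:
    """Parse a Git LFS pointer file into its fields (hypothetical helper)."""
    fields = {}
    for line in text.strip().splitlines():
        # Each pointer line is "<key> <value>", split on the first space.
        key, _, value = line.strip().partition(" ")
        fields[key] = value
    # The oid field has the form "<algorithm>:<hex digest>".
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


def matches_pointer(data: bytes, pointer: dict) -> bool:
    """Return True if data has the size and SHA-256 digest named by the pointer."""
    if pointer["algo"] != "sha256":
        raise ValueError(f"unsupported oid algorithm: {pointer['algo']}")
    return (
        len(data) == pointer["size"]
        and hashlib.sha256(data).hexdigest() == pointer["digest"]
    )


# The new pointer from this commit, copied from the diff above.
new_pointer = parse_lfs_pointer(
    "version https://git-lfs.github.com/spec/v1\n"
    "oid sha256:797fbcef7a7434b5567ed5acc5c3f32e59a15ebee16c58da2131d7d750e3ff42\n"
    "size 34219693\n"
)
```

To check a real download, read the file's bytes and pass them to `matches_pointer` together with `new_pointer`. For a file of this size (about 34 MB) reading it into memory at once is fine; larger files would call for hashing in chunks.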