Model save

Files changed (7)

README.md +15 -20
all_results.json +4 -4
config.json +1 -1
runs/Sep24_14-18-45_qa-a100-004.crc.nd.edu/events.out.tfevents.1727202087.qa-a100-004.crc.nd.edu.1892028.0 +3 -0
train_results.json +4 -4
trainer_state.json +4 -4
training_args.bin +1 -1

README.md CHANGED

@@ -3,15 +3,10 @@ library_name: transformers
 license: apache-2.0
 base_model: alignment-handbook/zephyr-7b-sft-full
 tags:
-- alignment-handbook
-- trl
-- dpo
-- generated_from_trainer
 - trl
 - dpo
 - generated_from_trainer
-datasets:
-- HuggingFaceH4/ultrafeedback_binarized
 model-index:
 - name: zephyr-7b-align-scan-0.0-0.2-polynomial-3
   results: []
@@ -22,17 +17,17 @@ should probably proofread and complete it, then remove this comment. -->
 # zephyr-7b-align-scan-0.0-0.2-polynomial-3
-This model is a fine-tuned version of [alignment-handbook/zephyr-7b-sft-full](https://huggingface.co/alignment-handbook/zephyr-7b-sft-full) on the HuggingFaceH4/ultrafeedback_binarized dataset.
 It achieves the following results on the evaluation set:
-- Loss: 0.7011
-- Rewards/chosen: -1.4864
-- Rewards/rejected: -2.0925
 - Rewards/accuracies: 0.3313
-- Rewards/margins: 0.6061
-- Logps/rejected: -89.9617
-- Logps/chosen: -80.7659
-- Logits/rejected: -2.4878
-- Logits/chosen: -2.5051
 ## Model description
@@ -51,7 +46,7 @@ More information needed
 ### Training hyperparameters
 The following hyperparameters were used during training:
-- learning_rate: 2.1445950529764786e-07
 - train_batch_size: 8
 - eval_batch_size: 8
 - seed: 42
@@ -67,10 +62,10 @@ The following hyperparameters were used during training:
 ### Training results
-| Training Loss | Epoch  | Step | Validation Loss | Rewards/chosen | Rewards/rejected | Rewards/accuracies | Rewards/margins | Logps/rejected | Logps/chosen | Logits/rejected | Logits/chosen |
-|:-------------:|:------:|:----:|:---------------:|:--------------:|:----------------:|:------------------:|:---------------:|:--------------:|:------------:|:---------------:|:-------------:|
-| 0.5502        | 1.0417 | 100  | 0.6403          | 0.2781         | 0.0281           | 0.3254             | 0.2500          | -81.0099       | -73.3173     | -2.5124         | -2.5290       |
-| 0.377         | 2.0833 | 200  | 0.6467          | -0.1914        | -0.6247          | 0.3313             | 0.4333          | -83.7654       | -75.2992     | -2.5078         | -2.5251       |
 ### Framework versions

 license: apache-2.0
 base_model: alignment-handbook/zephyr-7b-sft-full
 tags:
 - trl
 - dpo
+- alignment-handbook
 - generated_from_trainer
 model-index:
 - name: zephyr-7b-align-scan-0.0-0.2-polynomial-3
   results: []
 # zephyr-7b-align-scan-0.0-0.2-polynomial-3
+This model is a fine-tuned version of [alignment-handbook/zephyr-7b-sft-full](https://huggingface.co/alignment-handbook/zephyr-7b-sft-full) on an unknown dataset.
 It achieves the following results on the evaluation set:
+- Logits/chosen: -2.5251
+- Logits/rejected: -2.5078
+- Logps/chosen: -75.2992
+- Logps/rejected: -83.7654
+- Loss: 0.6467
 - Rewards/accuracies: 0.3313
+- Rewards/chosen: -0.1914
+- Rewards/margins: 0.4333
+- Rewards/rejected: -0.6247
 ## Model description
 ### Training hyperparameters
 The following hyperparameters were used during training:
+- learning_rate: 7.526744872300726e-07
 - train_batch_size: 8
 - eval_batch_size: 8
 - seed: 42
 ### Training results
+| Training Loss | Epoch  | Step | Logits/chosen | Logits/rejected | Logps/chosen | Logps/rejected | Validation Loss | Rewards/accuracies | Rewards/chosen | Rewards/margins | Rewards/rejected |
+|:-------------:|:------:|:----:|:-------------:|:---------------:|:------------:|:--------------:|:---------------:|:------------------:|:--------------:|:---------------:|:----------------:|
+| 0.5502        | 1.0417 | 100  | -2.5290       | -2.5124         | -73.3173     | -81.0099       | 0.6403          | 0.3254             | 0.2781         | 0.2500          | 0.0281           |
+| 0.377         | 2.0833 | 200  | -2.5251       | -2.5078         | -75.2992     | -83.7654       | 0.6467          | 0.3313             | -0.1914        | 0.4333          | -0.6247          |
 ### Framework versions
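
The reordered metric columns can be sanity-checked against each other. The sketch below assumes that `Rewards/margins` is defined as `Rewards/chosen` minus `Rewards/rejected` (the usual DPO convention); the row values are copied from the updated table, while the helper name `margin_matches` and the tolerance are illustrative choices, not part of the repository.

```python
# Evaluation rows copied from the updated README table (steps 100 and 200).
rows = [
    {"step": 100, "chosen": 0.2781, "rejected": 0.0281, "margin": 0.2500},
    {"step": 200, "chosen": -0.1914, "rejected": -0.6247, "margin": 0.4333},
]


def margin_matches(row, tol=1e-4):
    """Return True if the reported margin equals chosen - rejected within tol.

    Assumption: margin = chosen reward - rejected reward.
    The tolerance covers the four-decimal rounding used in the table.
    """
    return abs((row["chosen"] - row["rejected"]) - row["margin"]) <= tol


# Map each evaluation step to whether its reported margin is self-consistent.
checks = {row["step"]: margin_matches(row) for row in rows}
print(checks)
```

The same check applies to the removed summary values (chosen -1.4864, rejected -2.0925, margin 0.6061).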

all_results.json CHANGED

@@ -14,9 +14,9 @@
     "eval_samples_per_second": 17.543,
     "eval_steps_per_second": 0.553,
     "total_flos": 0.0,
-    "train_loss": 0.4920133708251847,
-    "train_runtime": 3526.2726,
     "train_samples": 6113,
-    "train_samples_per_second": 5.201,
-    "train_steps_per_second": 0.082
 }

     "eval_samples_per_second": 17.543,
     "eval_steps_per_second": 0.553,
     "total_flos": 0.0,
+    "train_loss": 0.0,
+    "train_runtime": 0.0237,
     "train_samples": 6113,
+    "train_samples_per_second": 774670.33,
+    "train_steps_per_second": 12165.606
 }

config.json CHANGED

@@ -22,6 +22,6 @@
   "tie_word_embeddings": false,
   "torch_dtype": "bfloat16",
   "transformers_version": "4.44.2",
-  "use_cache": true,
   "vocab_size": 32000
 }

   "tie_word_embeddings": false,
   "torch_dtype": "bfloat16",
   "transformers_version": "4.44.2",
+  "use_cache": false,
   "vocab_size": 32000
 }

runs/Sep24_14-18-45_qa-a100-004.crc.nd.edu/events.out.tfevents.1727202087.qa-a100-004.crc.nd.edu.1892028.0 ADDED

@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:54068d221111f0d30718ebac97582b857e36f8cc2c89291b53b7f207ece8f67b
+size 6828

train_results.json CHANGED

@@ -1,9 +1,9 @@
 {
     "epoch": 3.0,
     "total_flos": 0.0,
-    "train_loss": 0.4920133708251847,
-    "train_runtime": 3526.2726,
     "train_samples": 6113,
-    "train_samples_per_second": 5.201,
-    "train_steps_per_second": 0.082
 }

 {
     "epoch": 3.0,
     "total_flos": 0.0,
+    "train_loss": 0.0,
+    "train_runtime": 0.0237,
     "train_samples": 6113,
+    "train_samples_per_second": 774670.33,
+    "train_steps_per_second": 12165.606
 }
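
The throughput fields in this file can be recomputed from the other values in the diff. The sketch below assumes that samples per second is `train_samples * epoch / train_runtime` and steps per second is `step / train_runtime`, with `step` (288) taken from `trainer_state.json`; the function name `throughput` and the `REPORTED` dictionary are illustrative, not part of the repository.

```python
def throughput(train_samples, epochs, steps, runtime_s):
    """Recompute (samples/s, steps/s) from run totals.

    Assumption: throughput is total samples seen (samples x epochs)
    and total optimizer steps, each divided by wall-clock runtime.
    """
    samples_per_second = train_samples * epochs / runtime_s
    steps_per_second = steps / runtime_s
    return samples_per_second, steps_per_second


# Values copied from the removed ("old") and added ("new") lines of the diff.
REPORTED = {
    "old": {"runtime": 3526.2726, "samples_per_s": 5.201, "steps_per_s": 0.082},
    "new": {"runtime": 0.0237, "samples_per_s": 774670.33, "steps_per_s": 12165.606},
}

results = {}
for label, rep in REPORTED.items():
    sps, stps = throughput(6113, 3.0, 288, rep["runtime"])
    # Relative deviation between the recomputed and reported samples/s.
    rel_err = abs(sps - rep["samples_per_s"]) / rep["samples_per_s"]
    results[label] = (sps, stps, rel_err)
    print(f"{label}: {sps:.3f} samples/s, {stps:.3f} steps/s, rel. error {rel_err:.4%}")
```

Under these assumptions the old values reproduce after rounding to three decimals, and the new values agree to within roughly 0.2 percent, the gap being explained by `train_runtime` itself being rounded to four decimals. A runtime of 0.0237 s with a `train_loss` of 0.0 would be consistent with a run that performed no optimizer steps before saving, although nothing in the diff confirms why.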

trainer_state.json CHANGED

@@ -479,10 +479,10 @@
       "epoch": 3.0,
       "step": 288,
       "total_flos": 0.0,
-      "train_loss": 0.4920133708251847,
-      "train_runtime": 3526.2726,
-      "train_samples_per_second": 5.201,
-      "train_steps_per_second": 0.082
     }
   ],
   "logging_steps": 10,

       "epoch": 3.0,
       "step": 288,
       "total_flos": 0.0,
+      "train_loss": 0.0,
+      "train_runtime": 0.0237,
+      "train_samples_per_second": 774670.33,
+      "train_steps_per_second": 12165.606
     }
   ],
   "logging_steps": 10,

training_args.bin CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:492c64dc58da7ba228f8deac2fac8a1ed145c20f522d9c6826444661dccf8d7b
 size 7672

 version https://git-lfs.github.com/spec/v1
+oid sha256:d3768ab389c39deb7c4853b0e98898373bab62f7663a87fd250647e7bee6999f
 size 7672