madatnlp committed on
Commit 3ade13e · 1 Parent(s): ea3ec97

Training in progress epoch 0

Files changed (2):
  1. README.md +5 -93
  2. tf_model.h5 +1 -1
README.md CHANGED
@@ -13,9 +13,9 @@ probably proofread and complete it, then remove this comment. -->
 
 This model is a fine-tuned version of [madatnlp/ke-t5-math-py](https://huggingface.co/madatnlp/ke-t5-math-py) on an unknown dataset.
 It achieves the following results on the evaluation set:
-- Train Loss: 1.8367
-- Validation Loss: 1.5850
-- Epoch: 88
+- Train Loss: 8.1521
+- Validation Loss: 4.7300
+- Epoch: 0
 
 ## Model description
 
@@ -34,102 +34,14 @@ More information needed
 ### Training hyperparameters
 
 The following hyperparameters were used during training:
-- optimizer: {'name': 'Adam', 'learning_rate': 1e-05, 'decay': 0.0, 'beta_1': 0.9, 'beta_2': 0.999, 'epsilon': 1e-07, 'amsgrad': False}
+- optimizer: {'name': 'Adam', 'learning_rate': 1e-04, 'decay': 0.0, 'beta_1': 0.9, 'beta_2': 0.999, 'epsilon': 1e-07, 'amsgrad': False}
 - training_precision: float32
 
 ### Training results
 
 | Train Loss | Validation Loss | Epoch |
 |:----------:|:---------------:|:-----:|
-| 13.5076    | 11.8125         | 0     |
-| 11.0983    | 9.4857          | 1     |
-| 9.4413     | 7.9593          | 2     |
-| 8.2675     | 6.9802          | 3     |
-| 7.3769     | 6.1898          | 4     |
-| 6.6978     | 5.6209          | 5     |
-| 6.2266     | 5.1054          | 6     |
-| 5.7871     | 4.9395          | 7     |
-| 5.4937     | 4.6256          | 8     |
-| 5.2013     | 4.4694          | 9     |
-| 4.9649     | 4.1716          | 10    |
-| 4.7273     | 4.0317          | 11    |
-| 4.5237     | 3.7622          | 12    |
-| 4.3581     | 3.4826          | 13    |
-| 4.2078     | 3.4463          | 14    |
-| 4.0755     | 3.2685          | 15    |
-| 3.9494     | 3.1492          | 16    |
-| 3.8338     | 3.1535          | 17    |
-| 3.6767     | 2.8725          | 18    |
-| 3.6546     | 3.1201          | 19    |
-| 3.5395     | 3.0338          | 20    |
-| 3.4086     | 2.9991          | 21    |
-| 3.3886     | 2.8730          | 22    |
-| 3.2900     | 2.8334          | 23    |
-| 3.2906     | 2.6087          | 24    |
-| 3.1844     | 2.6765          | 25    |
-| 3.1672     | 2.6972          | 26    |
-| 3.1023     | 2.5778          | 27    |
-| 3.0528     | 2.5352          | 28    |
-| 2.9885     | 2.5250          | 29    |
-| 2.9455     | 2.6048          | 30    |
-| 2.9025     | 2.3874          | 31    |
-| 2.9228     | 2.4521          | 32    |
-| 2.8160     | 2.2810          | 33    |
-| 2.7895     | 2.3317          | 34    |
-| 2.7372     | 2.3300          | 35    |
-| 2.7494     | 2.3160          | 36    |
-| 2.7219     | 2.3736          | 37    |
-| 2.6818     | 2.3031          | 38    |
-| 2.6464     | 2.2736          | 39    |
-| 2.5834     | 2.2104          | 40    |
-| 2.5779     | 2.0641          | 41    |
-| 2.5577     | 2.0439          | 42    |
-| 2.5212     | 2.0828          | 43    |
-| 2.5029     | 2.1416          | 44    |
-| 2.4391     | 2.0837          | 45    |
-| 2.4556     | 2.0950          | 46    |
-| 2.4138     | 1.8874          | 47    |
-| 2.4138     | 1.9967          | 48    |
-| 2.3698     | 2.0096          | 49    |
-| 2.3776     | 1.9152          | 50    |
-| 2.3011     | 2.0284          | 51    |
-| 2.3454     | 2.0002          | 52    |
-| 2.2767     | 1.9544          | 53    |
-| 2.2332     | 1.8651          | 54    |
-| 2.2900     | 1.9383          | 55    |
-| 2.2442     | 1.8779          | 56    |
-| 2.2183     | 1.8790          | 57    |
-| 2.1824     | 1.7470          | 58    |
-| 2.1648     | 1.7715          | 59    |
-| 2.1859     | 1.8188          | 60    |
-| 2.1529     | 1.7747          | 61    |
-| 2.1343     | 1.8870          | 62    |
-| 2.1344     | 1.8471          | 63    |
-| 2.0876     | 1.8135          | 64    |
-| 2.0775     | 1.7311          | 65    |
-| 2.0557     | 1.8648          | 66    |
-| 2.1017     | 1.6826          | 67    |
-| 2.0649     | 1.7404          | 68    |
-| 2.0505     | 1.6182          | 69    |
-| 2.0084     | 1.6731          | 70    |
-| 2.0143     | 1.6890          | 71    |
-| 1.9882     | 1.6767          | 72    |
-| 1.9759     | 1.5758          | 73    |
-| 1.9800     | 1.7079          | 74    |
-| 1.9602     | 1.6354          | 75    |
-| 1.9580     | 1.6015          | 76    |
-| 1.9401     | 1.5779          | 77    |
-| 1.9070     | 1.5071          | 78    |
-| 1.9304     | 1.5554          | 79    |
-| 1.8987     | 1.5434          | 80    |
-| 1.8927     | 1.6711          | 81    |
-| 1.9044     | 1.5399          | 82    |
-| 1.8664     | 1.5820          | 83    |
-| 1.8860     | 1.5097          | 84    |
-| 1.8043     | 1.5495          | 85    |
-| 1.8571     | 1.5327          | 86    |
-| 1.8285     | 1.5381          | 87    |
-| 1.8367     | 1.5850          | 88    |
+| 8.1521     | 4.7300          | 0     |
 
 
 ### Framework versions
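The optimizer entry in the diff above is a standard Adam configuration (the dict layout matches a Keras optimizer config), with the learning rate raised from 1e-05 to 1e-04 for this restarted run. As a rough illustration of what those hyperparameters mean, here is a minimal pure-Python sketch of a single-parameter Adam update using the new values; this is illustrative only, not the framework's actual implementation (`decay=0.0` and `amsgrad=False` are omitted since they are no-ops here):

```python
# Single-parameter Adam update with the hyperparameters from the diff:
# learning_rate=1e-04, beta_1=0.9, beta_2=0.999, epsilon=1e-07.
# Illustrative sketch only -- training itself used the framework optimizer.

def adam_step(param, grad, m, v, t,
              lr=1e-4, beta_1=0.9, beta_2=0.999, epsilon=1e-7):
    """One Adam update for a scalar parameter; returns (param, m, v)."""
    m = beta_1 * m + (1.0 - beta_1) * grad        # first-moment (mean) estimate
    v = beta_2 * v + (1.0 - beta_2) * grad ** 2   # second-moment estimate
    m_hat = m / (1.0 - beta_1 ** t)               # bias correction, t starts at 1
    v_hat = v / (1.0 - beta_2 ** t)
    param -= lr * m_hat / (v_hat ** 0.5 + epsilon)
    return param, m, v

# A few steps on a constant gradient of 1.0: each bias-corrected step
# has magnitude ~lr, so the parameter moves by roughly -1e-4 per step.
p, m, v = 0.0, 0.0, 0.0
for t in range(1, 4):
    p, m, v = adam_step(p, 1.0, m, v, t)
```

With a constant unit gradient the bias-corrected moments are exactly 1, which makes the effective step size equal to the learning rate; that is the sense in which the 10x bump from 1e-05 to 1e-04 makes each update ten times larger.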
tf_model.h5 CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:3f2e0db0f096db0bf51fdec842ba8b0bf0de934e6568656e15f2221c79621a90
+oid sha256:ad96d2d93701a0d3375d55436059f67f334c23e72aa4308c35b492eddfc7a1cb
 size 831509840
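The tf_model.h5 change touches only its Git LFS pointer file: `oid sha256:` is the SHA-256 digest of the real weights blob and `size` is its byte count, so new weights with the same byte length change the oid but not the size. A minimal sketch of checking a downloaded blob against such a pointer (the blob here is an in-memory stand-in, not the actual 831 MB file):

```python
import hashlib

def check_lfs_pointer(data: bytes, expected_oid: str, expected_size: int) -> bool:
    """Return True if `data` matches the oid/size recorded in an LFS pointer."""
    if len(data) != expected_size:          # cheap size check first
        return False
    return hashlib.sha256(data).hexdigest() == expected_oid

# Example with a stand-in blob for the weights file:
blob = b"example model bytes"
oid = hashlib.sha256(blob).hexdigest()
print(check_lfs_pointer(blob, oid, len(blob)))  # True
```

Comparing the full digest rather than just the size is what lets Git LFS detect that the weights actually changed in this commit even though `size 831509840` is identical on both sides.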