madatnlp committed on
Commit 3ade13e · 1 Parent(s): ea3ec97

Training in progress epoch 0

Files changed (2):
  1. README.md +5 -93
  2. tf_model.h5 +1 -1
README.md CHANGED
@@ -13,9 +13,9 @@ probably proofread and complete it, then remove this comment. -->
 
 This model is a fine-tuned version of [madatnlp/ke-t5-math-py](https://huggingface.co/madatnlp/ke-t5-math-py) on an unknown dataset.
 It achieves the following results on the evaluation set:
-- Train Loss: 1.8367
-- Validation Loss: 1.5850
-- Epoch: 88
+- Train Loss: 8.1521
+- Validation Loss: 4.7300
+- Epoch: 0
 
 ## Model description
 
@@ -34,102 +34,14 @@ More information needed
 ### Training hyperparameters
 
 The following hyperparameters were used during training:
-- optimizer: {'name': 'Adam', 'learning_rate': 1e-05, 'decay': 0.0, 'beta_1': 0.9, 'beta_2': 0.999, 'epsilon': 1e-07, 'amsgrad': False}
+- optimizer: {'name': 'Adam', 'learning_rate': 1e-04, 'decay': 0.0, 'beta_1': 0.9, 'beta_2': 0.999, 'epsilon': 1e-07, 'amsgrad': False}
 - training_precision: float32
 
 ### Training results
 
 | Train Loss | Validation Loss | Epoch |
 |:----------:|:---------------:|:-----:|
-| 13.5076    | 11.8125         | 0     |
-| 11.0983    | 9.4857          | 1     |
-| 9.4413     | 7.9593          | 2     |
-| 8.2675     | 6.9802          | 3     |
-| 7.3769     | 6.1898          | 4     |
-| 6.6978     | 5.6209          | 5     |
-| 6.2266     | 5.1054          | 6     |
-| 5.7871     | 4.9395          | 7     |
-| 5.4937     | 4.6256          | 8     |
-| 5.2013     | 4.4694          | 9     |
-| 4.9649     | 4.1716          | 10    |
-| 4.7273     | 4.0317          | 11    |
-| 4.5237     | 3.7622          | 12    |
-| 4.3581     | 3.4826          | 13    |
-| 4.2078     | 3.4463          | 14    |
-| 4.0755     | 3.2685          | 15    |
-| 3.9494     | 3.1492          | 16    |
-| 3.8338     | 3.1535          | 17    |
-| 3.6767     | 2.8725          | 18    |
-| 3.6546     | 3.1201          | 19    |
-| 3.5395     | 3.0338          | 20    |
-| 3.4086     | 2.9991          | 21    |
-| 3.3886     | 2.8730          | 22    |
-| 3.2900     | 2.8334          | 23    |
-| 3.2906     | 2.6087          | 24    |
-| 3.1844     | 2.6765          | 25    |
-| 3.1672     | 2.6972          | 26    |
-| 3.1023     | 2.5778          | 27    |
-| 3.0528     | 2.5352          | 28    |
-| 2.9885     | 2.5250          | 29    |
-| 2.9455     | 2.6048          | 30    |
-| 2.9025     | 2.3874          | 31    |
-| 2.9228     | 2.4521          | 32    |
-| 2.8160     | 2.2810          | 33    |
-| 2.7895     | 2.3317          | 34    |
-| 2.7372     | 2.3300          | 35    |
-| 2.7494     | 2.3160          | 36    |
-| 2.7219     | 2.3736          | 37    |
-| 2.6818     | 2.3031          | 38    |
-| 2.6464     | 2.2736          | 39    |
-| 2.5834     | 2.2104          | 40    |
-| 2.5779     | 2.0641          | 41    |
-| 2.5577     | 2.0439          | 42    |
-| 2.5212     | 2.0828          | 43    |
-| 2.5029     | 2.1416          | 44    |
-| 2.4391     | 2.0837          | 45    |
-| 2.4556     | 2.0950          | 46    |
-| 2.4138     | 1.8874          | 47    |
-| 2.4138     | 1.9967          | 48    |
-| 2.3698     | 2.0096          | 49    |
-| 2.3776     | 1.9152          | 50    |
-| 2.3011     | 2.0284          | 51    |
-| 2.3454     | 2.0002          | 52    |
-| 2.2767     | 1.9544          | 53    |
-| 2.2332     | 1.8651          | 54    |
-| 2.2900     | 1.9383          | 55    |
-| 2.2442     | 1.8779          | 56    |
-| 2.2183     | 1.8790          | 57    |
-| 2.1824     | 1.7470          | 58    |
-| 2.1648     | 1.7715          | 59    |
-| 2.1859     | 1.8188          | 60    |
-| 2.1529     | 1.7747          | 61    |
-| 2.1343     | 1.8870          | 62    |
-| 2.1344     | 1.8471          | 63    |
-| 2.0876     | 1.8135          | 64    |
-| 2.0775     | 1.7311          | 65    |
-| 2.0557     | 1.8648          | 66    |
-| 2.1017     | 1.6826          | 67    |
-| 2.0649     | 1.7404          | 68    |
-| 2.0505     | 1.6182          | 69    |
-| 2.0084     | 1.6731          | 70    |
-| 2.0143     | 1.6890          | 71    |
-| 1.9882     | 1.6767          | 72    |
-| 1.9759     | 1.5758          | 73    |
-| 1.9800     | 1.7079          | 74    |
-| 1.9602     | 1.6354          | 75    |
-| 1.9580     | 1.6015          | 76    |
-| 1.9401     | 1.5779          | 77    |
-| 1.9070     | 1.5071          | 78    |
-| 1.9304     | 1.5554          | 79    |
-| 1.8987     | 1.5434          | 80    |
-| 1.8927     | 1.6711          | 81    |
-| 1.9044     | 1.5399          | 82    |
-| 1.8664     | 1.5820          | 83    |
-| 1.8860     | 1.5097          | 84    |
-| 1.8043     | 1.5495          | 85    |
-| 1.8571     | 1.5327          | 86    |
-| 1.8285     | 1.5381          | 87    |
-| 1.8367     | 1.5850          | 88    |
+| 8.1521     | 4.7300          | 0     |
 
 
 ### Framework versions
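The optimizer entry in the diff above is a standard Adam configuration (the dict layout matches a Keras optimizer config), with the learning rate raised from 1e-05 to 1e-04 for this restarted run. As a rough illustration of what those hyperparameters mean, here is a minimal pure-Python sketch of a single-parameter Adam update using the new values; this is illustrative only, not the framework's actual implementation (`decay=0.0` and `amsgrad=False` are omitted since they are no-ops here):

```python
# Single-parameter Adam update with the hyperparameters from the diff:
# learning_rate=1e-04, beta_1=0.9, beta_2=0.999, epsilon=1e-07.
# Illustrative sketch only -- training itself used the framework optimizer.

def adam_step(param, grad, m, v, t,
              lr=1e-4, beta_1=0.9, beta_2=0.999, epsilon=1e-7):
    """One Adam update for a scalar parameter; returns (param, m, v)."""
    m = beta_1 * m + (1.0 - beta_1) * grad        # first-moment (mean) estimate
    v = beta_2 * v + (1.0 - beta_2) * grad ** 2   # second-moment estimate
    m_hat = m / (1.0 - beta_1 ** t)               # bias correction, t starts at 1
    v_hat = v / (1.0 - beta_2 ** t)
    param -= lr * m_hat / (v_hat ** 0.5 + epsilon)
    return param, m, v

# A few steps on a constant gradient of 1.0: each bias-corrected step
# has magnitude ~lr, so the parameter moves by roughly -1e-4 per step.
p, m, v = 0.0, 0.0, 0.0
for t in range(1, 4):
    p, m, v = adam_step(p, 1.0, m, v, t)
```

With a constant unit gradient the bias-corrected moments are exactly 1, which makes the effective step size equal to the learning rate; that is the sense in which the 10x bump from 1e-05 to 1e-04 makes each update ten times larger.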
tf_model.h5 CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:3f2e0db0f096db0bf51fdec842ba8b0bf0de934e6568656e15f2221c79621a90
+oid sha256:ad96d2d93701a0d3375d55436059f67f334c23e72aa4308c35b492eddfc7a1cb
 size 831509840
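The tf_model.h5 change touches only its Git LFS pointer file: `oid sha256:` is the SHA-256 digest of the real weights blob and `size` is its byte count, so new weights with the same byte length change the oid but not the size. A minimal sketch of checking a downloaded blob against such a pointer (the blob here is an in-memory stand-in, not the actual 831 MB file):

```python
import hashlib

def check_lfs_pointer(data: bytes, expected_oid: str, expected_size: int) -> bool:
    """Return True if `data` matches the oid/size recorded in an LFS pointer."""
    if len(data) != expected_size:          # cheap size check first
        return False
    return hashlib.sha256(data).hexdigest() == expected_oid

# Example with a stand-in blob for the weights file:
blob = b"example model bytes"
oid = hashlib.sha256(blob).hexdigest()
print(check_lfs_pointer(blob, oid, len(blob)))  # True
```

Comparing the full digest rather than just the size is what lets Git LFS detect that the weights actually changed in this commit even though `size 831509840` is identical on both sides.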