	---
	tags:
	- generated_from_keras_callback
	model-index:
	- name: madatnlp/ke-t5-math-py
	  results: []
	---

	<!-- This model card has been generated automatically according to the information Keras had access to. You should
	probably proofread and complete it, then remove this comment. -->

	# madatnlp/ke-t5-math-py

	This model is a fine-tuned version of [KETI-AIR/ke-t5-base-ko](https://huggingface.co/KETI-AIR/ke-t5-base-ko) on an unknown dataset.
	At the end of training (epoch 47), it reports the following losses on the training and evaluation sets:
	- Train Loss: 0.1203
	- Validation Loss: 0.4336
	- Epoch: 47
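
The card does not document an input format, so the following is only a minimal loading sketch. It assumes the repository holds TensorFlow weights (the card was generated from a Keras callback) and that `transformers`, `tensorflow`, and `sentencepiece` are installed. The helper name `generate_text` and its `text` and `max_length` parameters are hypothetical, introduced for illustration.

```python
def generate_text(text, model_id="madatnlp/ke-t5-math-py", max_length=64):
    """Load the checkpoint and generate one output string for `text`.

    Assumption: the expected prompt format is not documented in this card,
    so `text` is passed to the tokenizer unchanged.
    """
    # Imported lazily so that defining the helper does not require the libraries.
    from transformers import AutoTokenizer, TFAutoModelForSeq2SeqLM

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = TFAutoModelForSeq2SeqLM.from_pretrained(model_id)

    # Tokenize to TensorFlow tensors and decode the first generated sequence.
    inputs = tokenizer(text, return_tensors="tf")
    output_ids = model.generate(
        inputs["input_ids"],
        attention_mask=inputs["attention_mask"],
        max_length=max_length,
    )
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Calling `generate_text("...")` downloads the checkpoint on first use. The output quality for any given prompt is not covered by this card.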

	## Model description

	More information needed

	## Intended uses & limitations

	More information needed

	## Training and evaluation data

	More information needed

	## Training procedure

	### Training hyperparameters

	The following hyperparameters were used during training:
	- optimizer: {'name': 'Adam', 'learning_rate': 0.001, 'decay': 0.0, 'beta_1': 0.9, 'beta_2': 0.999, 'epsilon': 1e-07, 'amsgrad': False}
	- training_precision: float32
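
To make the optimizer settings concrete, here is a minimal sketch of one Adam update on a single scalar parameter, using the values listed above. It follows the standard bias-corrected formulation of Adam. The exact placement of `epsilon` inside Keras may differ slightly, and `adam_step` is a hypothetical helper, not part of any library. With `decay` set to 0.0 and `amsgrad` disabled, the learning rate is treated as constant and no maximum of past second moments is tracked.

```python
import math

# Hyperparameters copied from the optimizer entry above.
LEARNING_RATE = 0.001
BETA_1 = 0.9
BETA_2 = 0.999
EPSILON = 1e-07


def adam_step(param, grad, m, v, t):
    """Apply one Adam update to a scalar parameter.

    `t` is the 1-based step count. Returns (new_param, new_m, new_v).
    """
    # Exponential moving averages of the gradient and the squared gradient.
    m = BETA_1 * m + (1.0 - BETA_1) * grad
    v = BETA_2 * v + (1.0 - BETA_2) * grad * grad
    # Bias correction compensates for the zero initialization of m and v.
    m_hat = m / (1.0 - BETA_1 ** t)
    v_hat = v / (1.0 - BETA_2 ** t)
    new_param = param - LEARNING_RATE * m_hat / (math.sqrt(v_hat) + EPSILON)
    return new_param, m, v


# First step from param=1.0 with gradient 0.5 and zero-initialized moments.
param, m, v = adam_step(1.0, 0.5, 0.0, 0.0, 1)
```

On the first step the bias-corrected ratio is close to the sign of the gradient, so the parameter moves by roughly one learning rate (about 0.001) regardless of the gradient's magnitude.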

	### Training results

	| Train Loss | Validation Loss | Epoch |
	|:----------:|:---------------:|:-----:|
	| 2.0197 | 1.2886 | 0 |
	| 1.5642 | 1.1261 | 1 |
	| 1.3713 | 1.0296 | 2 |
	| 1.2555 | 0.9905 | 3 |
	| 1.1708 | 0.9628 | 4 |
	| 1.1161 | 0.9133 | 5 |
	| 1.0704 | 0.8994 | 6 |
	| 1.0297 | 0.8911 | 7 |
	| 0.9898 | 0.8570 | 8 |
	| 0.9608 | 0.8497 | 9 |
	| 0.9326 | 0.8359 | 10 |
	| 0.9089 | 0.8387 | 11 |
	| 0.8882 | 0.8083 | 12 |
	| 0.8627 | 0.8154 | 13 |
	| 0.8467 | 0.8058 | 14 |
	| 0.8314 | 0.7905 | 15 |
	| 0.8071 | 0.7852 | 16 |
	| 0.7975 | 0.7873 | 17 |
	| 0.8021 | 0.7926 | 18 |
	| 0.7754 | 0.7858 | 19 |
	| 0.7598 | 0.7941 | 20 |
	| 0.7463 | 0.7769 | 21 |
	| 0.7266 | 0.7594 | 22 |
	| 0.7092 | 0.7744 | 23 |
	| 0.6986 | 0.7611 | 24 |
	| 0.6818 | 0.7592 | 25 |
	| 0.6775 | 0.7718 | 26 |
	| 0.6689 | 0.7685 | 27 |
	| 0.6474 | 0.7554 | 28 |
	| 0.6328 | 0.7601 | 29 |
	| 0.6050 | 0.7042 | 30 |
	| 0.5296 | 0.5711 | 31 |
	| 0.4310 | 0.5227 | 32 |
	| 0.3729 | 0.4740 | 33 |
	| 0.3353 | 0.4552 | 34 |
	| 0.3006 | 0.4375 | 35 |
	| 0.2750 | 0.4233 | 36 |
	| 0.2494 | 0.4487 | 37 |
	| 0.2287 | 0.4294 | 38 |
	| 0.2160 | 0.4119 | 39 |
	| 0.1980 | 0.4309 | 40 |
	| 0.1837 | 0.4182 | 41 |
	| 0.1699 | 0.4045 | 42 |
	| 0.1577 | 0.4065 | 43 |
	| 0.1498 | 0.4247 | 44 |
	| 0.1392 | 0.4102 | 45 |
	| 0.1282 | 0.4274 | 46 |
	| 0.1203 | 0.4336 | 47 |
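
The table can be read more easily with a small script. The sketch below copies the validation-loss column and finds the epoch with the lowest value, then compares it with the final epoch. The variable names are illustrative, and nothing here implies that the published weights come from any epoch other than the last one.

```python
# Validation losses for epochs 0..47, copied from the table above.
val_loss = [
    1.2886, 1.1261, 1.0296, 0.9905, 0.9628, 0.9133, 0.8994, 0.8911,
    0.8570, 0.8497, 0.8359, 0.8387, 0.8083, 0.8154, 0.8058, 0.7905,
    0.7852, 0.7873, 0.7926, 0.7858, 0.7941, 0.7769, 0.7594, 0.7744,
    0.7611, 0.7592, 0.7718, 0.7685, 0.7554, 0.7601, 0.7042, 0.5711,
    0.5227, 0.4740, 0.4552, 0.4375, 0.4233, 0.4487, 0.4294, 0.4119,
    0.4309, 0.4182, 0.4045, 0.4065, 0.4247, 0.4102, 0.4274, 0.4336,
]

# Train loss at the final epoch, also from the table.
final_train_loss = 0.1203

# Index of the smallest validation loss; list indices match epoch numbers.
best_epoch = min(range(len(val_loss)), key=val_loss.__getitem__)
final_epoch = len(val_loss) - 1

# How far the final validation loss sits above the best one, and the
# train/validation gap at the end of training.
drift_from_best = val_loss[final_epoch] - val_loss[best_epoch]
generalization_gap = val_loss[final_epoch] - final_train_loss

print(f"best epoch: {best_epoch} (val loss {val_loss[best_epoch]:.4f})")
print(f"final epoch: {final_epoch} (val loss {val_loss[final_epoch]:.4f})")
print(f"drift from best: {drift_from_best:.4f}")
print(f"final train/val gap: {generalization_gap:.4f}")
```

Validation loss bottoms out a few epochs before the end while train loss keeps falling, which is the usual pattern that early stopping or checkpoint selection on validation loss is meant to catch.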


	### Framework versions

	- Transformers 4.18.0
	- TensorFlow 2.8.0
	- Datasets 2.1.0
	- Tokenizers 0.12.1
