calculator_model_test

This model is a fine-tuned version of an unspecified base model on an unspecified dataset. It achieves the following results on the evaluation set:

  • Loss: 0.0853

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.001
  • train_batch_size: 512
  • eval_batch_size: 512
  • seed: 42
  • optimizer: AdamW (torch implementation) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • num_epochs: 40
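As a rough sketch, the hyperparameters above can be collected into a plain Python dict whose keys mirror the usual transformers.TrainingArguments parameter names (the mapping is assumed; the actual training script is not shown in this card):

```python
# Hedged sketch: the training hyperparameters listed above, keyed by the
# standard transformers.TrainingArguments parameter names (assumed mapping).
hyperparameters = {
    "learning_rate": 1e-3,
    "per_device_train_batch_size": 512,
    "per_device_eval_batch_size": 512,
    "seed": 42,
    "optim": "adamw_torch",
    "adam_beta1": 0.9,
    "adam_beta2": 0.999,
    "adam_epsilon": 1e-8,
    "lr_scheduler_type": "linear",
    "num_train_epochs": 40,
}

# If the training setup matches this mapping, the dict could be passed
# straight through, e.g.:
#   from transformers import TrainingArguments
#   args = TrainingArguments(output_dir="out", **hyperparameters)
```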

Training results

| Training Loss | Epoch | Step | Validation Loss |
|---------------|-------|------|-----------------|
| 2.9446        | 1.0   | 6    | 2.2356          |
| 1.9974        | 2.0   | 12   | 1.7231          |
| 1.5322        | 3.0   | 18   | 1.3437          |
| 1.2026        | 4.0   | 24   | 1.0972          |
| 1.0097        | 5.0   | 30   | 0.9458          |
| 0.9242        | 6.0   | 36   | 0.9294          |
| 0.8364        | 7.0   | 42   | 0.7825          |
| 0.7532        | 8.0   | 48   | 0.7002          |
| 0.6887        | 9.0   | 54   | 0.6662          |
| 0.6490        | 10.0  | 60   | 0.6007          |
| 0.6128        | 11.0  | 66   | 0.5712          |
| 0.5774        | 12.0  | 72   | 0.5336          |
| 0.5380        | 13.0  | 78   | 0.5130          |
| 0.5073        | 14.0  | 84   | 0.4805          |
| 0.4820        | 15.0  | 90   | 0.4615          |
| 0.4972        | 16.0  | 96   | 0.4641          |
| 0.4565        | 17.0  | 102  | 0.4172          |
| 0.4232        | 18.0  | 108  | 0.3793          |
| 0.4058        | 19.0  | 114  | 0.3745          |
| 0.3859        | 20.0  | 120  | 0.3502          |
| 0.3611        | 21.0  | 126  | 0.3364          |
| 0.3410        | 22.0  | 132  | 0.2922          |
| 0.3166        | 23.0  | 138  | 0.2788          |
| 0.2848        | 24.0  | 144  | 0.2481          |
| 0.2663        | 25.0  | 150  | 0.2436          |
| 0.2558        | 26.0  | 156  | 0.2127          |
| 0.2335        | 27.0  | 162  | 0.1852          |
| 0.2041        | 28.0  | 168  | 0.1597          |
| 0.1830        | 29.0  | 174  | 0.1454          |
| 0.1720        | 30.0  | 180  | 0.1328          |
| 0.1634        | 31.0  | 186  | 0.1228          |
| 0.1518        | 32.0  | 192  | 0.1140          |
| 0.1483        | 33.0  | 198  | 0.1073          |
| 0.1429        | 34.0  | 204  | 0.1027          |
| 0.1344        | 35.0  | 210  | 0.0978          |
| 0.1299        | 36.0  | 216  | 0.0929          |
| 0.1298        | 37.0  | 222  | 0.0899          |
| 0.1247        | 38.0  | 228  | 0.0917          |
| 0.1253        | 39.0  | 234  | 0.0869          |
| 0.1141        | 40.0  | 240  | 0.0853          |
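The table implies 6 optimizer steps per epoch (240 steps over 40 epochs); combined with the batch size of 512, a quick back-of-the-envelope sketch bounds the training-set size, assuming a single device and no gradient accumulation:

```python
# Back-of-the-envelope reading of the results table above.
# Assumes one device and no gradient accumulation (not stated in the card).
total_steps = 240
num_epochs = 40
batch_size = 512

steps_per_epoch = total_steps // num_epochs        # 6 steps per epoch
max_train_examples = steps_per_epoch * batch_size  # at most 6 * 512 = 3072

# Validation loss fell from 2.2356 (epoch 1) to 0.0853 (epoch 40),
# an improvement of roughly 26x.
improvement = 2.2356 / 0.0853
```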

Framework versions

  • Transformers 4.48.0
  • Pytorch 2.5.1+cu124
  • Datasets 3.3.2
  • Tokenizers 0.21.0
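To reproduce this environment, the listed versions could be pinned in a requirements file (package names assumed to be the usual PyPI ones; the +cu124 build of PyTorch is installed from the PyTorch CUDA wheel index rather than plain PyPI):

```text
transformers==4.48.0
torch==2.5.1
datasets==3.3.2
tokenizers==0.21.0
```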
Model files

  • Safetensors, 7.8M parameters, F32 tensors