calculator_model_test

This model is a fine-tuned version of an unspecified base model; the card does not record the training dataset. On the evaluation set it reaches a final validation loss of 0.7094 (epoch 40, step 240).

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

The following hyperparameters were used during training:

learning_rate: 0.001
train_batch_size: 512
eval_batch_size: 512
seed: 42
optimizer: AdamW (OptimizerNames.ADAMW_TORCH) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
lr_scheduler_type: linear
num_epochs: 40
mixed_precision_training: Native AMP
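
As a rough illustration, the settings above could be written as keyword arguments for Hugging Face `transformers.TrainingArguments`, alongside the learning rate that a linear schedule would yield at a given step. The mapping to field names is an assumption, since the training script is not part of this card. The sketch also assumes no warmup (none is listed) and 240 total optimizer steps, the final step in the training results.

```python
# Hypothetical reconstruction of the reported settings; the original
# training script is not included in this card.
TRAINING_KWARGS = {
    "learning_rate": 1e-3,
    "per_device_train_batch_size": 512,
    "per_device_eval_batch_size": 512,
    "seed": 42,
    "optim": "adamw_torch",
    "adam_beta1": 0.9,
    "adam_beta2": 0.999,
    "adam_epsilon": 1e-8,
    "lr_scheduler_type": "linear",
    "num_train_epochs": 40,
    "fp16": True,  # assumption: "Native AMP" refers to fp16 rather than bf16
}

TOTAL_STEPS = 240  # final step reported in the training results
WARMUP_STEPS = 0   # assumption: the card lists no warmup


def linear_lr(step, base_lr=1e-3, total_steps=TOTAL_STEPS, warmup_steps=WARMUP_STEPS):
    """Learning rate at optimizer step `step`: linear warmup, then linear decay to zero."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


if __name__ == "__main__":
    # Print the scheduled learning rate at a few points during training.
    for step in (0, 60, 120, 180, 240):
        print(step, linear_lr(step))
```

If `transformers` is installed, the dictionary could be passed as `TrainingArguments(output_dir="out", **TRAINING_KWARGS)`, where `output_dir="out"` is a placeholder.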

Training Loss	Epoch	Step	Validation Loss
3.645	1.0	6	3.1423
2.6617	2.0	12	2.2004
1.9491	3.0	18	1.7518
1.6288	4.0	24	1.6348
1.5541	5.0	30	1.5475
1.538	6.0	36	1.9365
1.6453	7.0	42	1.5644
1.5197	8.0	48	1.5235
1.4557	9.0	54	1.4558
1.3767	10.0	60	1.4314
1.3545	11.0	66	1.3279
1.288	12.0	72	1.2356
1.227	13.0	78	1.3249
1.2531	14.0	84	1.2027
1.1426	15.0	90	1.1148
1.0918	16.0	96	1.1311
1.0839	17.0	102	1.1813
1.0836	18.0	108	1.0659
1.0585	19.0	114	1.0008
0.9926	20.0	120	0.9494
0.9478	21.0	126	0.9661
0.9775	22.0	132	0.9673
0.934	23.0	138	0.9720
0.924	24.0	144	0.9773
0.9248	25.0	150	0.9146
0.9067	26.0	156	0.8718
0.8721	27.0	162	0.8443
0.8469	28.0	168	0.8218
0.8193	29.0	174	0.8011
0.8128	30.0	180	0.7859
0.8052	31.0	186	0.7834
0.7991	32.0	192	0.7729
0.7989	33.0	198	0.7604
0.8723	34.0	204	0.7481
0.7696	35.0	210	0.7421
0.7635	36.0	216	0.7335
0.7615	37.0	222	0.7248
0.7523	38.0	228	0.7232
0.7501	39.0	234	0.7160
0.7662	40.0	240	0.7094
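
To make the trend in the table easier to inspect, the sketch below recomputes a few summary facts from the validation-loss column: the best epoch, the epochs where validation loss rose relative to the previous epoch, and a rough bound on the training-set size. The size bound assumes a single device, no gradient accumulation, and that the last partial batch is kept; the card states none of these.

```python
# Validation loss per epoch (epochs 1..40), copied from the table above.
VAL_LOSS = [
    3.1423, 2.2004, 1.7518, 1.6348, 1.5475, 1.9365, 1.5644, 1.5235,
    1.4558, 1.4314, 1.3279, 1.2356, 1.3249, 1.2027, 1.1148, 1.1311,
    1.1813, 1.0659, 1.0008, 0.9494, 0.9661, 0.9673, 0.9720, 0.9773,
    0.9146, 0.8718, 0.8443, 0.8218, 0.8011, 0.7859, 0.7834, 0.7729,
    0.7604, 0.7481, 0.7421, 0.7335, 0.7248, 0.7232, 0.7160, 0.7094,
]

STEPS_PER_EPOCH = 240 // 40  # 240 steps over 40 epochs
BATCH_SIZE = 512


def best_epoch(losses):
    """Return (1-based epoch, loss) for the lowest validation loss."""
    idx = min(range(len(losses)), key=losses.__getitem__)
    return idx + 1, losses[idx]


def regressions(losses):
    """Return the 1-based epochs whose loss is higher than the previous epoch's."""
    return [i + 1 for i in range(1, len(losses)) if losses[i] > losses[i - 1]]


def train_size_bounds(steps_per_epoch, batch_size):
    """Smallest and largest example counts that yield this many batches per epoch."""
    return (steps_per_epoch - 1) * batch_size + 1, steps_per_epoch * batch_size


if __name__ == "__main__":
    print("best epoch:", best_epoch(VAL_LOSS))
    print("epochs with rising validation loss:", regressions(VAL_LOSS))
    print("training-set size bounds:", train_size_bounds(STEPS_PER_EPOCH, BATCH_SIZE))
```

The lowest validation loss falls on the final epoch, which suggests training may not have plateaued after 40 epochs.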