calculator_model_test

This model is a fine-tuned version of an unspecified base model on an unspecified dataset. It achieves the following results on the evaluation set:

  • Loss: 0.0853

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.001
  • train_batch_size: 512
  • eval_batch_size: 512
  • seed: 42
  • optimizer: AdamW (torch implementation) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • num_epochs: 40
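As a rough sketch, the hyperparameters above can be collected into a plain Python dict whose keys mirror the usual transformers.TrainingArguments parameter names (the mapping is assumed; the actual training script is not shown in this card):

```python
# Hedged sketch: the training hyperparameters listed above, keyed by the
# standard transformers.TrainingArguments parameter names (assumed mapping).
hyperparameters = {
    "learning_rate": 1e-3,
    "per_device_train_batch_size": 512,
    "per_device_eval_batch_size": 512,
    "seed": 42,
    "optim": "adamw_torch",
    "adam_beta1": 0.9,
    "adam_beta2": 0.999,
    "adam_epsilon": 1e-8,
    "lr_scheduler_type": "linear",
    "num_train_epochs": 40,
}

# If the training setup matches this mapping, the dict could be passed
# straight through, e.g.:
#   from transformers import TrainingArguments
#   args = TrainingArguments(output_dir="out", **hyperparameters)
```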

Training results

| Training Loss | Epoch | Step | Validation Loss |
|---------------|-------|------|-----------------|
| 2.9446        | 1.0   | 6    | 2.2356          |
| 1.9974        | 2.0   | 12   | 1.7231          |
| 1.5322        | 3.0   | 18   | 1.3437          |
| 1.2026        | 4.0   | 24   | 1.0972          |
| 1.0097        | 5.0   | 30   | 0.9458          |
| 0.9242        | 6.0   | 36   | 0.9294          |
| 0.8364        | 7.0   | 42   | 0.7825          |
| 0.7532        | 8.0   | 48   | 0.7002          |
| 0.6887        | 9.0   | 54   | 0.6662          |
| 0.6490        | 10.0  | 60   | 0.6007          |
| 0.6128        | 11.0  | 66   | 0.5712          |
| 0.5774        | 12.0  | 72   | 0.5336          |
| 0.5380        | 13.0  | 78   | 0.5130          |
| 0.5073        | 14.0  | 84   | 0.4805          |
| 0.4820        | 15.0  | 90   | 0.4615          |
| 0.4972        | 16.0  | 96   | 0.4641          |
| 0.4565        | 17.0  | 102  | 0.4172          |
| 0.4232        | 18.0  | 108  | 0.3793          |
| 0.4058        | 19.0  | 114  | 0.3745          |
| 0.3859        | 20.0  | 120  | 0.3502          |
| 0.3611        | 21.0  | 126  | 0.3364          |
| 0.3410        | 22.0  | 132  | 0.2922          |
| 0.3166        | 23.0  | 138  | 0.2788          |
| 0.2848        | 24.0  | 144  | 0.2481          |
| 0.2663        | 25.0  | 150  | 0.2436          |
| 0.2558        | 26.0  | 156  | 0.2127          |
| 0.2335        | 27.0  | 162  | 0.1852          |
| 0.2041        | 28.0  | 168  | 0.1597          |
| 0.1830        | 29.0  | 174  | 0.1454          |
| 0.1720        | 30.0  | 180  | 0.1328          |
| 0.1634        | 31.0  | 186  | 0.1228          |
| 0.1518        | 32.0  | 192  | 0.1140          |
| 0.1483        | 33.0  | 198  | 0.1073          |
| 0.1429        | 34.0  | 204  | 0.1027          |
| 0.1344        | 35.0  | 210  | 0.0978          |
| 0.1299        | 36.0  | 216  | 0.0929          |
| 0.1298        | 37.0  | 222  | 0.0899          |
| 0.1247        | 38.0  | 228  | 0.0917          |
| 0.1253        | 39.0  | 234  | 0.0869          |
| 0.1141        | 40.0  | 240  | 0.0853          |
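The table implies 6 optimizer steps per epoch (240 steps over 40 epochs); combined with the batch size of 512, a quick back-of-the-envelope sketch bounds the training-set size, assuming a single device and no gradient accumulation:

```python
# Back-of-the-envelope reading of the results table above.
# Assumes one device and no gradient accumulation (not stated in the card).
total_steps = 240
num_epochs = 40
batch_size = 512

steps_per_epoch = total_steps // num_epochs        # 6 steps per epoch
max_train_examples = steps_per_epoch * batch_size  # at most 6 * 512 = 3072

# Validation loss fell from 2.2356 (epoch 1) to 0.0853 (epoch 40),
# an improvement of roughly 26x.
improvement = 2.2356 / 0.0853
```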

Framework versions

  • Transformers 4.48.0
  • Pytorch 2.5.1+cu124
  • Datasets 3.3.2
  • Tokenizers 0.21.0
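To reproduce this environment, the listed versions could be pinned in a requirements file (package names assumed to be the usual PyPI ones; the +cu124 build of PyTorch is installed from the PyTorch CUDA wheel index rather than plain PyPI):

```text
transformers==4.48.0
torch==2.5.1
datasets==3.3.2
tokenizers==0.21.0
```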
Model files

  • Safetensors, 7.8M parameters, F32 tensors