metadata
license: mit
base_model: FacebookAI/xlm-roberta-base
tags:
  - generated_from_trainer
metrics:
  - accuracy
  - f1
model-index:
  - name: XMLRoberta_70KURL
    results: []

XMLRoberta_70KURL

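This model is a fine-tuned version of FacebookAI/xlm-roberta-base. The fine-tuning dataset is not documented. The results below are for the final checkpoint (epoch 20, step 8600) on the evaluation set; the training results table shows lower validation loss at earlier steps, with a minimum of 0.0813 at step 2400.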

  • Loss: 0.4150
  • Accuracy: 0.9408
  • F1: 0.9448

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 2e-05
  • train_batch_size: 64
  • eval_batch_size: 64
  • seed: 42
  • gradient_accumulation_steps: 2
  • total_train_batch_size: 128
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 2150
  • num_epochs: 20
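The relationship between these values can be checked with a short calculation. The sketch below is illustrative only: it assumes a single training device (consistent with 64 × 2 = 128), takes the total of 8600 optimizer steps from the last row of the training results table, and models the linear scheduler as a linear ramp to the peak learning rate followed by a linear decay to zero. The function names are hypothetical and not part of the original training script.

```python
# Values copied from the hyperparameter list above.
LEARNING_RATE = 2e-05
TRAIN_BATCH_SIZE = 64
GRAD_ACCUM_STEPS = 2
WARMUP_STEPS = 2150

# Assumption: total optimizer steps, read from the final row of the
# training results table (step 8600 at epoch 20).
TOTAL_STEPS = 8600


def effective_batch_size(per_step: int, accum: int, num_devices: int = 1) -> int:
    """Number of examples contributing to one optimizer update."""
    return per_step * accum * num_devices


def linear_lr(step: int) -> float:
    """Learning rate at a given optimizer step under linear warmup and decay."""
    if step < WARMUP_STEPS:
        # Warmup phase: ramp linearly from 0 up to the peak learning rate.
        return LEARNING_RATE * step / WARMUP_STEPS
    # Decay phase: fall linearly from the peak to 0 at TOTAL_STEPS.
    remaining = max(0, TOTAL_STEPS - step)
    return LEARNING_RATE * remaining / (TOTAL_STEPS - WARMUP_STEPS)


if __name__ == "__main__":
    print("effective batch size:", effective_batch_size(TRAIN_BATCH_SIZE, GRAD_ACCUM_STEPS))
    print("warmup fraction of training:", WARMUP_STEPS / TOTAL_STEPS)
    for step in (0, 1075, 2150, 5375, 8600):
        print(step, linear_lr(step))
```

Under these assumptions the warmup covers the first quarter of training (2150 of 8600 steps, or 5 of the 20 epochs), so the learning rate only reaches 2e-05 at the start of epoch 6.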

Training results

| Training Loss | Epoch | Step | Validation Loss | Accuracy | F1 |
|:-------------:|:-----:|:----:|:---------------:|:--------:|:--:|
| No log | 0.4651 | 200 | 0.4701 | 0.8955 | 0.8673 |
| No log | 0.9302 | 400 | 0.1893 | 0.9359 | 0.9368 |
| No log | 1.3953 | 600 | 0.2241 | 0.9128 | 0.9192 |
| No log | 1.8605 | 800 | 0.2777 | 0.8848 | 0.8984 |
| 0.382 | 2.3256 | 1000 | 0.1388 | 0.9504 | 0.9525 |
| 0.382 | 2.7907 | 1200 | 0.1028 | 0.9694 | 0.9701 |
| 0.382 | 3.2558 | 1400 | 0.1413 | 0.9557 | 0.9579 |
| 0.382 | 3.7209 | 1600 | 0.0929 | 0.9718 | 0.9722 |
| 0.1521 | 4.1860 | 1800 | 0.1008 | 0.9695 | 0.9702 |
| 0.1521 | 4.6512 | 2000 | 0.1939 | 0.9238 | 0.9306 |
| 0.1521 | 5.1163 | 2200 | 0.1550 | 0.9401 | 0.9443 |
| 0.1521 | 5.5814 | 2400 | 0.0813 | 0.9744 | 0.9750 |
| 0.1044 | 6.0465 | 2600 | 0.2088 | 0.9193 | 0.9267 |
| 0.1044 | 6.5116 | 2800 | 0.1343 | 0.9523 | 0.9548 |
| 0.1044 | 6.9767 | 3000 | 0.2172 | 0.9219 | 0.9289 |
| 0.1044 | 7.4419 | 3200 | 0.1097 | 0.9656 | 0.9668 |
| 0.1044 | 7.9070 | 3400 | 0.3044 | 0.9147 | 0.9230 |
| 0.0762 | 8.3721 | 3600 | 0.2122 | 0.9283 | 0.9341 |
| 0.0762 | 8.8372 | 3800 | 0.1430 | 0.9532 | 0.9556 |
| 0.0762 | 9.3023 | 4000 | 0.2332 | 0.9312 | 0.9368 |
| 0.0762 | 9.7674 | 4200 | 0.2167 | 0.9297 | 0.9353 |
| 0.0564 | 10.2326 | 4400 | 0.1904 | 0.9475 | 0.9506 |
| 0.0564 | 10.6977 | 4600 | 0.2916 | 0.9196 | 0.9270 |
| 0.0564 | 11.1628 | 4800 | 0.2317 | 0.9451 | 0.9484 |
| 0.0564 | 11.6279 | 5000 | 0.2430 | 0.9475 | 0.9503 |
| 0.042 | 12.0930 | 5200 | 0.4035 | 0.9248 | 0.9315 |
| 0.042 | 12.5581 | 5400 | 0.3060 | 0.9352 | 0.9398 |
| 0.042 | 13.0233 | 5600 | 0.2894 | 0.9359 | 0.9407 |
| 0.042 | 13.4884 | 5800 | 0.2804 | 0.9439 | 0.9474 |
| 0.042 | 13.9535 | 6000 | 0.2941 | 0.9456 | 0.9490 |
| 0.0316 | 14.4186 | 6200 | 0.3211 | 0.9424 | 0.9460 |
| 0.0316 | 14.8837 | 6400 | 0.3453 | 0.9371 | 0.9416 |
| 0.0316 | 15.3488 | 6600 | 0.2587 | 0.9548 | 0.9569 |
| 0.0316 | 15.8140 | 6800 | 0.3433 | 0.9432 | 0.9468 |
| 0.025 | 16.2791 | 7000 | 0.3454 | 0.9416 | 0.9455 |
| 0.025 | 16.7442 | 7200 | 0.2977 | 0.9450 | 0.9484 |
| 0.025 | 17.2093 | 7400 | 0.3622 | 0.9452 | 0.9486 |
| 0.025 | 17.6744 | 7600 | 0.3035 | 0.9550 | 0.9572 |
| 0.0196 | 18.1395 | 7800 | 0.3588 | 0.9464 | 0.9496 |
| 0.0196 | 18.6047 | 8000 | 0.3714 | 0.9467 | 0.9500 |
| 0.0196 | 19.0698 | 8200 | 0.4517 | 0.9341 | 0.9391 |
| 0.0196 | 19.5349 | 8400 | 0.4078 | 0.9411 | 0.9451 |
| 0.0148 | 20.0 | 8600 | 0.4150 | 0.9408 | 0.9448 |
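The log above can be used to compare the final checkpoint with earlier ones. The following is a minimal sketch, not part of the original training code: it copies a handful of the 43 evaluation rows by hand, picks the one with the lowest validation loss, and recovers the number of optimizer steps per epoch from the last row. The `EvalRow` type and the helper names are hypothetical.

```python
from typing import List, NamedTuple


class EvalRow(NamedTuple):
    step: int
    val_loss: float
    accuracy: float
    f1: float


# A subset of rows copied from the table above; the full log has 43 rows.
EVAL_LOG: List[EvalRow] = [
    EvalRow(400, 0.1893, 0.9359, 0.9368),
    EvalRow(1600, 0.0929, 0.9718, 0.9722),
    EvalRow(2400, 0.0813, 0.9744, 0.9750),
    EvalRow(5200, 0.4035, 0.9248, 0.9315),
    EvalRow(8600, 0.4150, 0.9408, 0.9448),  # final checkpoint
]

TOTAL_STEPS = 8600  # last Step value in the table
NUM_EPOCHS = 20     # from the hyperparameter list


def best_by_val_loss(log: List[EvalRow]) -> EvalRow:
    """Return the evaluation row with the lowest validation loss."""
    return min(log, key=lambda row: row.val_loss)


def steps_per_epoch(total_steps: int, num_epochs: int) -> int:
    """Optimizer steps in one epoch, assuming every epoch has the same length."""
    return total_steps // num_epochs


if __name__ == "__main__":
    best = best_by_val_loss(EVAL_LOG)
    final = EVAL_LOG[-1]
    print("best step:", best.step, "val_loss:", best.val_loss, "f1:", best.f1)
    print("final step:", final.step, "val_loss:", final.val_loss, "f1:", final.f1)
    print("steps per epoch:", steps_per_epoch(TOTAL_STEPS, NUM_EPOCHS))
```

In this log, training loss keeps falling (0.382 down to 0.0148) while validation loss rises from its minimum at step 2400 to 0.4150 at the end, a pattern that usually indicates overfitting. Whether an earlier checkpoint was saved, and whether the published weights correspond to the final step, is not stated in this card.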

Framework versions

  • Transformers 4.41.2
  • PyTorch 2.1.2
  • Datasets 2.19.2
  • Tokenizers 0.19.1