gechim committed
Commit b37e135 · verified
1 Parent(s): d644c81

End of training

Files changed (4)
  1. README.md +107 -0
  2. config.json +29 -0
  3. model.safetensors +3 -0
  4. training_args.bin +3 -0
README.md ADDED
@@ -0,0 +1,107 @@
+ ---
+ license: mit
+ base_model: FacebookAI/xlm-roberta-base
+ tags:
+ - generated_from_trainer
+ metrics:
+ - accuracy
+ - f1
+ model-index:
+ - name: XMLRoberta_70KURL
+   results: []
+ ---
+
+ <!-- This model card has been generated automatically according to the information the Trainer had access to. You
+ should probably proofread and complete it, then remove this comment. -->
+
+ # XMLRoberta_70KURL
+
+ This model is a fine-tuned version of [FacebookAI/xlm-roberta-base](https://huggingface.co/FacebookAI/xlm-roberta-base) on an unknown dataset.
+ It achieves the following results on the evaluation set:
+ - Loss: 0.0047
+ - Accuracy: 0.9983
+ - F1: 0.9983
+
+ ## Model description
+
+ More information needed
+
+ ## Intended uses & limitations
+
+ More information needed
+
+ ## Training and evaluation data
+
+ More information needed
+
+ ## Training procedure
+
+ ### Training hyperparameters
+
+ The following hyperparameters were used during training:
+ - learning_rate: 2e-05
+ - train_batch_size: 64
+ - eval_batch_size: 64
+ - seed: 42
+ - gradient_accumulation_steps: 2
+ - total_train_batch_size: 128
+ - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
+ - lr_scheduler_type: linear
+ - lr_scheduler_warmup_steps: 2150
+ - num_epochs: 20
+
+ ### Training results
+
+ | Training Loss | Epoch | Step | Validation Loss | Accuracy | F1 |
+ |:-------------:|:-------:|:----:|:---------------:|:--------:|:------:|
+ | No log | 0.4651 | 200 | 0.4939 | 0.6704 | 0.5381 |
+ | No log | 0.9302 | 400 | 0.2605 | 0.9039 | 0.9028 |
+ | No log | 1.3953 | 600 | 0.2110 | 0.9241 | 0.9236 |
+ | No log | 1.8605 | 800 | 0.1589 | 0.9422 | 0.9425 |
+ | 0.3715 | 2.3256 | 1000 | 0.1458 | 0.9488 | 0.9489 |
+ | 0.3715 | 2.7907 | 1200 | 0.1282 | 0.9549 | 0.9549 |
+ | 0.3715 | 3.2558 | 1400 | 0.1106 | 0.9616 | 0.9616 |
+ | 0.3715 | 3.7209 | 1600 | 0.1033 | 0.9650 | 0.9651 |
+ | 0.1516 | 4.1860 | 1800 | 0.0973 | 0.9657 | 0.9657 |
+ | 0.1516 | 4.6512 | 2000 | 0.0872 | 0.9706 | 0.9708 |
+ | 0.1516 | 5.1163 | 2200 | 0.0743 | 0.9740 | 0.9741 |
+ | 0.1516 | 5.5814 | 2400 | 0.0702 | 0.9759 | 0.9760 |
+ | 0.1056 | 6.0465 | 2600 | 0.0651 | 0.9780 | 0.9781 |
+ | 0.1056 | 6.5116 | 2800 | 0.0565 | 0.9805 | 0.9806 |
+ | 0.1056 | 6.9767 | 3000 | 0.0550 | 0.9804 | 0.9805 |
+ | 0.1056 | 7.4419 | 3200 | 0.0516 | 0.9837 | 0.9837 |
+ | 0.1056 | 7.9070 | 3400 | 0.0649 | 0.9782 | 0.9784 |
+ | 0.0776 | 8.3721 | 3600 | 0.0371 | 0.9878 | 0.9878 |
+ | 0.0776 | 8.8372 | 3800 | 0.0333 | 0.9887 | 0.9887 |
+ | 0.0776 | 9.3023 | 4000 | 0.0317 | 0.9895 | 0.9895 |
+ | 0.0776 | 9.7674 | 4200 | 0.0296 | 0.9899 | 0.9900 |
+ | 0.0581 | 10.2326 | 4400 | 0.0259 | 0.9919 | 0.9919 |
+ | 0.0581 | 10.6977 | 4600 | 0.0235 | 0.9923 | 0.9923 |
+ | 0.0581 | 11.1628 | 4800 | 0.0204 | 0.9929 | 0.9929 |
+ | 0.0581 | 11.6279 | 5000 | 0.0172 | 0.9941 | 0.9941 |
+ | 0.042 | 12.0930 | 5200 | 0.0219 | 0.9929 | 0.9929 |
+ | 0.042 | 12.5581 | 5400 | 0.0147 | 0.9953 | 0.9953 |
+ | 0.042 | 13.0233 | 5600 | 0.0141 | 0.9952 | 0.9952 |
+ | 0.042 | 13.4884 | 5800 | 0.0108 | 0.9964 | 0.9964 |
+ | 0.042 | 13.9535 | 6000 | 0.0094 | 0.9967 | 0.9967 |
+ | 0.0314 | 14.4186 | 6200 | 0.0087 | 0.9969 | 0.9969 |
+ | 0.0314 | 14.8837 | 6400 | 0.0083 | 0.9971 | 0.9971 |
+ | 0.0314 | 15.3488 | 6600 | 0.0077 | 0.9973 | 0.9973 |
+ | 0.0314 | 15.8140 | 6800 | 0.0086 | 0.9972 | 0.9972 |
+ | 0.0246 | 16.2791 | 7000 | 0.0066 | 0.9979 | 0.9979 |
+ | 0.0246 | 16.7442 | 7200 | 0.0061 | 0.9977 | 0.9977 |
+ | 0.0246 | 17.2093 | 7400 | 0.0065 | 0.9979 | 0.9979 |
+ | 0.0246 | 17.6744 | 7600 | 0.0061 | 0.9980 | 0.9980 |
+ | 0.0183 | 18.1395 | 7800 | 0.0052 | 0.9981 | 0.9981 |
+ | 0.0183 | 18.6047 | 8000 | 0.0048 | 0.9983 | 0.9983 |
+ | 0.0183 | 19.0698 | 8200 | 0.0047 | 0.9983 | 0.9983 |
+ | 0.0183 | 19.5349 | 8400 | 0.0047 | 0.9983 | 0.9983 |
+ | 0.0145 | 20.0 | 8600 | 0.0047 | 0.9983 | 0.9983 |
+
+
+ ### Framework versions
+
+ - Transformers 4.41.2
+ - Pytorch 2.1.2
+ - Datasets 2.19.2
+ - Tokenizers 0.19.1
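
For reference, the hyperparameters listed in the card above map one-to-one onto `TrainingArguments` in Transformers 4.41.2. A minimal sketch follows; the dataset objects (`train_ds`, `eval_ds`) and `num_labels=2` are placeholders, since the commit describes the training data only as an "unknown dataset":

```python
# Sketch only: reproduces the hyperparameters from the model card above.
# train_ds / eval_ds and num_labels=2 are assumptions, not part of the commit.
from transformers import (AutoModelForSequenceClassification, Trainer,
                          TrainingArguments)

model = AutoModelForSequenceClassification.from_pretrained(
    "FacebookAI/xlm-roberta-base", num_labels=2)

args = TrainingArguments(
    output_dir="XMLRoberta_70KURL",
    learning_rate=2e-5,
    per_device_train_batch_size=64,
    per_device_eval_batch_size=64,
    gradient_accumulation_steps=2,  # 64 x 2 = total_train_batch_size 128
    num_train_epochs=20,
    seed=42,
    lr_scheduler_type="linear",
    warmup_steps=2150,
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
)

trainer = Trainer(model=model, args=args,
                  train_dataset=train_ds, eval_dataset=eval_ds)
trainer.train()
```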
config.json ADDED
@@ -0,0 +1,29 @@
+ {
+   "_name_or_path": "FacebookAI/xlm-roberta-base",
+   "architectures": [
+     "XLMRobertaForSequenceClassification"
+   ],
+   "attention_probs_dropout_prob": 0.1,
+   "bos_token_id": 0,
+   "classifier_dropout": null,
+   "eos_token_id": 2,
+   "hidden_act": "gelu",
+   "hidden_dropout_prob": 0.1,
+   "hidden_size": 768,
+   "initializer_range": 0.02,
+   "intermediate_size": 3072,
+   "layer_norm_eps": 1e-05,
+   "max_position_embeddings": 514,
+   "model_type": "xlm-roberta",
+   "num_attention_heads": 12,
+   "num_hidden_layers": 12,
+   "output_past": true,
+   "pad_token_id": 1,
+   "position_embedding_type": "absolute",
+   "problem_type": "single_label_classification",
+   "torch_dtype": "float32",
+   "transformers_version": "4.41.2",
+   "type_vocab_size": 1,
+   "use_cache": true,
+   "vocab_size": 250002
+ }
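
Given the `XLMRobertaForSequenceClassification` architecture and `problem_type: single_label_classification` in the config, inference reduces to an argmax over the logits. A minimal sketch, assuming the checkpoint is published under the hub id `gechim/XMLRoberta_70KURL` (inferred from the committer and model name, not stated in the diff) and that inputs are URL strings, as the model name suggests:

```python
# Sketch only: the repo id and the URL-style input are assumptions.
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

repo_id = "gechim/XMLRoberta_70KURL"  # hypothetical hub id
tokenizer = AutoTokenizer.from_pretrained(repo_id)
model = AutoModelForSequenceClassification.from_pretrained(repo_id)
model.eval()

inputs = tokenizer("http://example.com/login", return_tensors="pt",
                   truncation=True, max_length=512)
with torch.no_grad():
    logits = model(**inputs).logits  # shape: (1, num_labels)
print(logits.argmax(dim=-1).item())  # single-label class prediction
```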
model.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:c60b2e18dfe6969c184cc1e9a6c2b079b1fc8b387f0f03ecd8754c9a7ee35429
+ size 1112205008
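
The three lines above are a Git LFS pointer: the ~1.1 GB weights file is stored out of band and identified by its SHA-256. A downloaded copy can be checked against the pointer like this (the local path is an assumption):

```python
# Sketch only: verify a local model.safetensors against the LFS pointer's oid.
import hashlib

EXPECTED = "c60b2e18dfe6969c184cc1e9a6c2b079b1fc8b387f0f03ecd8754c9a7ee35429"

h = hashlib.sha256()
with open("model.safetensors", "rb") as f:            # assumed local path
    for chunk in iter(lambda: f.read(1 << 20), b""):  # stream 1 MiB at a time
        h.update(chunk)

assert h.hexdigest() == EXPECTED, "checksum mismatch"
print("OK: file matches the pointer (expected size 1112205008 bytes)")
```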
training_args.bin ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:d507eba1684da293a27d6fba69c56e04123d18981a4eebf72766e1bd0b1c1820
+ size 5048