rsepulvedat committed on
Commit 302fa43 · verified · 1 Parent(s): eea9fe1

Update README.md

  1. README.md +44 -3
README.md CHANGED
@@ -1,3 +1,44 @@
- ---
- license: apache-2.0
- ---
+ ---
+ license: apache-2.0
+ datasets:
+ - gplsi/SocialTOX
+ language:
+ - es
+ metrics:
+ - accuracy
+ - f1
+ - precision
+ - recall
+ base_model:
+ - BSC-LT/roberta-base-bne
+ pipeline_tag: text-classification
+ ---
+
+ # 🧠 Toxicity_model_RoBERTa-base-bne – Binary Spanish Toxicity Classifier (Fine-tuned)
+
+ ## 📌 Model Description
+
+ This model is a **fine-tuned version** of `RoBERTa-base-bne`, trained to classify **Spanish-language user comments on news articles** as toxic or non-toxic. It distinguishes between two categories:
+
+ - **Non-toxic**
+ - **Toxic**
+
+ The model takes a single comment as input and returns a single classification label.
+
+ ---
+
+ ## 📂 Training Data
+
+ The model was fine-tuned on the **[SocialTOX dataset](https://huggingface.co/datasets/gplsi/SocialTOX)**, a collection of Spanish-language comments annotated for varying levels of toxicity. The comments come from news platforms and represent real-world online discourse. For this model, a binary classifier was developed: the classes *Slightly toxic* and *Toxic* were merged into a single *Toxic* category.
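The label merge described above can be sketched as a small lookup table. This is an illustration only, not the authors' preprocessing code: the exact label strings stored in SocialTOX are an assumption, and `BINARY_LABEL` and `to_binary` are hypothetical names introduced here.

```python
# Hypothetical mapping from fine-grained labels to the binary scheme.
# The label strings are assumed; check the dataset card for the exact values.
BINARY_LABEL = {
    "Non-toxic": "Non-toxic",
    "Slightly toxic": "Toxic",  # merged into Toxic, as the README describes
    "Toxic": "Toxic",
}


def to_binary(label: str) -> str:
    """Collapse a fine-grained toxicity label into Non-toxic / Toxic."""
    try:
        return BINARY_LABEL[label]
    except KeyError:
        # Fail loudly on labels the README does not mention.
        raise ValueError(f"unexpected label: {label!r}") from None


fine_grained = ["Non-toxic", "Slightly toxic", "Toxic"]
binary = [to_binary(label) for label in fine_grained]
```

Raising on unknown labels, rather than defaulting to one class, keeps any annotation level not covered by the README from being silently merged.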
+
+ ---
+
+ ## Training hyperparameters
+ - epochs: 10
+ - learning_rate: 2.45e-6
+ - beta1: 0.9
+ - beta2: 0.95
+ - Adam_epsilon: 1.00e-8
+ - weight_decay: 0
+ - batch_size: 16
+ - max_seq_length: 512
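
The README does not say which training framework was used. As a minimal sketch, assuming the Hugging Face `transformers` `Trainer`, the values above would map onto `TrainingArguments` keyword names roughly as follows. `FineTuneConfig` and `to_training_kwargs` are hypothetical helpers, and reading `batch_size` as the per-device batch size is an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FineTuneConfig:
    """Hyperparameters as listed in the README."""
    epochs: int = 10
    learning_rate: float = 2.45e-6
    beta1: float = 0.9
    beta2: float = 0.95
    adam_epsilon: float = 1.00e-8
    weight_decay: float = 0.0
    batch_size: int = 16
    max_seq_length: int = 512


def to_training_kwargs(cfg: FineTuneConfig) -> dict:
    """Rename the fields to transformers.TrainingArguments keyword names."""
    return {
        "num_train_epochs": cfg.epochs,
        "learning_rate": cfg.learning_rate,
        "adam_beta1": cfg.beta1,
        "adam_beta2": cfg.beta2,
        "adam_epsilon": cfg.adam_epsilon,
        "weight_decay": cfg.weight_decay,
        # Assumption: the README's batch_size is per device.
        "per_device_train_batch_size": cfg.batch_size,
    }


cfg = FineTuneConfig()
training_kwargs = to_training_kwargs(cfg)

# max_seq_length is applied at tokenization time, not by the trainer.
tokenizer_kwargs = {"max_length": cfg.max_seq_length, "truncation": True}
```

With `transformers` installed, `TrainingArguments(output_dir=..., **training_kwargs)` would accept these names; the output directory is left unspecified because the README does not give one.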