gplsi/Toxicity_model

Text Classification

amarmol committed 21 days ago

Commit a899994 · verified · 1 Parent(s): c49b2bc

Update README.md

Files changed (1)

README.md +27 -2

README.md CHANGED

@@ -11,7 +11,32 @@ base_model:
 - PlanTL-GOB-ES/roberta-base-bne
 pipeline_tag: text-classification
 library_name: transformers
 ---
-Comment classification model according to their toxicity.
-This model has been obtained by fine-tuning the RoBERTa language model in Spanish (https://huggingface.co/PlanTL-GOB-ES/roberta-base-bne)

 - PlanTL-GOB-ES/roberta-base-bne
 pipeline_tag: text-classification
 library_name: transformers
+datasets:
+- gplsi/SocialTOX
 ---
+# 🧠 Toxicity_model_RoBERTa-base-bne – Spanish Toxicity Classifier Multiclass (Fine-tuned)
+## 📌 Model Description
+This model is a **fine-tuned version** of `RoBERTa-base-bne`, trained to classify the toxicity level of **Spanish-language user comments on news articles**. It distinguishes between three categories:
+- **Non-toxic**
+- **Slightly toxic**
+- **Toxic**
+---
+## 📂 Training Data
+The model was fine-tuned on the **[SocialTOX dataset](https://huggingface.co/datasets/gplsi/SocialTOX)**, a collection of Spanish-language comments annotated for varying levels of toxicity. These comments come from news platforms and represent real-world scenarios of online discourse. In this case, a binary classifier was developed, where the classes *Slightly toxic* and *Toxic* were merged into a single *Toxic* category.
+---
+## Training hyperparameters
+- epochs: 7
+- learning_rate: 1.51E-06
+- Adam_epsilon: 2.80E-08
+- weight_decay: 3.88E-12
+- batch_size: 16
+- max_seq_length: 512
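
The hyperparameters listed above could be collected in a small configuration object, roughly as sketched below. This is not the authors' training script, which the commit does not show. The `FineTuneConfig` class and the `to_training_kwargs` helper are hypothetical names introduced for illustration. The mapping assumes training went through the `transformers` `Trainer` API on a single device, so that `batch_size` corresponds to `per_device_train_batch_size`. `max_seq_length` is left out of the mapping because it would normally be applied at tokenization time rather than passed to the trainer.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FineTuneConfig:
    """Hyperparameter values as listed in the README (hypothetical container)."""
    epochs: int = 7
    learning_rate: float = 1.51e-06
    adam_epsilon: float = 2.80e-08
    weight_decay: float = 3.88e-12
    batch_size: int = 16
    max_seq_length: int = 512


def to_training_kwargs(cfg: FineTuneConfig) -> dict:
    """Map the README names onto TrainingArguments-style keyword names.

    Assumption: single-device training, so batch_size is the per-device size.
    max_seq_length is omitted; it belongs to the tokenizer call.
    """
    return {
        "num_train_epochs": cfg.epochs,
        "learning_rate": cfg.learning_rate,
        "adam_epsilon": cfg.adam_epsilon,
        "weight_decay": cfg.weight_decay,
        "per_device_train_batch_size": cfg.batch_size,
    }


if __name__ == "__main__":
    config = FineTuneConfig()
    for name, value in to_training_kwargs(config).items():
        print(f"{name} = {value}")
```

If `transformers` is installed, the resulting dictionary could be unpacked into `TrainingArguments(output_dir=..., **kwargs)`, with the output directory chosen by the user.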