Andrazp committed on
Commit
4e4c162
·
1 Parent(s): ea668d5

Create README.md

# English Hate Speech Classifier

This is a monolingual hate speech classifier for English-language social media content. The model was trained on a dataset of 103,190 YouTube comments and evaluated on a separately gathered test set of 20,554 YouTube comments. It is based on the pre-trained BERT base language model.

## Tokenizer

During training, the original tokenizer of the BERT base language model was used. It is therefore recommended to use the same tokenizer during inference.
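
As a rough illustration of the recommendation above, the sketch below tokenizes a batch of comments with a BERT tokenizer loaded through the Hugging Face `transformers` library. The helper name `encode_comments` and the constant `MAX_LENGTH` are hypothetical. The tokenizer identifier is left as a parameter because this README does not say which BERT base variant (cased or uncased) was used.

```python
# BERT base accepts at most 512 tokens per input sequence.
MAX_LENGTH = 512


def encode_comments(comments, tokenizer_name):
    """Tokenize a list of comment strings with the named BERT tokenizer.

    `tokenizer_name` is an assumption supplied by the caller; this README
    does not state which BERT base checkpoint the tokenizer comes from.
    """
    # Imported lazily so this module loads even without transformers installed.
    from transformers import AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(tokenizer_name)
    # Pad to the longest comment in the batch and cut off overly long ones.
    return tokenizer(
        comments,
        padding=True,
        truncation=True,
        max_length=MAX_LENGTH,
    )
```

A call might look like `encode_comments(["Have a nice day!"], "bert-base-cased")`, where `bert-base-cased` is only a guess at the matching checkpoint. The returned object holds `input_ids` and `attention_mask` entries that can be passed to the classifier.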

## Model output

The model classifies each input sentence into one of four classes: appropriate, inappropriate, offensive, or hateful.
The classes are encoded as follows:
* 0 - appropriate
* 1 - inappropriate
* 2 - offensive
* 3 - hateful
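
The encoding above can be turned into a small decoding step. The following is a minimal sketch, assuming the model emits one raw score (logit) per class in the index order listed above, as sequence classification heads usually do. The names `ID2LABEL`, `softmax`, and `decode` are hypothetical, and the example scores are made up for illustration.

```python
import math

# Class encoding as listed in this README.
ID2LABEL = {
    0: "appropriate",
    1: "inappropriate",
    2: "offensive",
    3: "hateful",
}


def softmax(logits):
    """Convert raw scores to probabilities (shifted by the max for stability)."""
    highest = max(logits)
    exps = [math.exp(x - highest) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def decode(logits):
    """Return (class id, label, probability) for one vector of four scores."""
    if len(logits) != len(ID2LABEL):
        raise ValueError("expected one score per class")
    probs = softmax(logits)
    class_id = max(range(len(probs)), key=probs.__getitem__)
    return class_id, ID2LABEL[class_id], probs[class_id]


# Made-up scores for illustration only, not real model output.
class_id, label, confidence = decode([0.2, -1.0, 2.5, 0.3])
print(class_id, label)  # prints "2 offensive"
```

Keeping the mapping in one dictionary means the numeric ids never have to be interpreted by hand downstream.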
