Description

polarity3c is a classification model specialized in determining the polarity of texts from news portals. It was trained mostly on Polish texts.

The training data is based on annotations from plWordNet. A model pre-trained on these annotations served as the human-in-the-loop assistant, supporting the annotation of the data used to train the final model. The final model was trained on web content that was collected and annotated manually.

The base model is sdadas/polish-roberta-large-v2 with a classification head on top. More about the model's construction can be found on our blog.

Architecture

RobertaForSequenceClassification(
  (roberta): RobertaModel(
    (embeddings): RobertaEmbeddings(
      (word_embeddings): Embedding(128001, 1024, padding_idx=1)
      (position_embeddings): Embedding(514, 1024, padding_idx=1)
      (token_type_embeddings): Embedding(1, 1024)
      (LayerNorm): LayerNorm((1024,), eps=1e-05, elementwise_affine=True)
      (dropout): Dropout(p=0.1, inplace=False)
    )
    (encoder): RobertaEncoder(
      (layer): ModuleList(
        (0-23): 24 x RobertaLayer(
          (attention): RobertaAttention(
            (self): RobertaSdpaSelfAttention(
              (query): Linear(in_features=1024, out_features=1024, bias=True)
              (key): Linear(in_features=1024, out_features=1024, bias=True)
              (value): Linear(in_features=1024, out_features=1024, bias=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
            (output): RobertaSelfOutput(
              (dense): Linear(in_features=1024, out_features=1024, bias=True)
              (LayerNorm): LayerNorm((1024,), eps=1e-05, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
          (intermediate): RobertaIntermediate(
            (dense): Linear(in_features=1024, out_features=4096, bias=True)
            (intermediate_act_fn): GELUActivation()
          )
          (output): RobertaOutput(
            (dense): Linear(in_features=4096, out_features=1024, bias=True)
            (LayerNorm): LayerNorm((1024,), eps=1e-05, elementwise_affine=True)
            (dropout): Dropout(p=0.1, inplace=False)
          )
        )
      )
    )
  )
  (classifier): RobertaClassificationHead(
    (dense): Linear(in_features=1024, out_features=1024, bias=True)
    (dropout): Dropout(p=0.1, inplace=False)
    (out_proj): Linear(in_features=1024, out_features=3, bias=True)
  )
)
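The dimensions in the printout above are enough to estimate the model's size by hand. A minimal sketch (pure arithmetic, no dependencies) that sums the weight and bias counts of each module listed, assuming no pooler layer since none appears in the dump:

```python
# Back-of-the-envelope parameter count for the architecture printed above:
# hidden=1024, 24 layers, FFN=4096, vocab=128001, max positions=514, 3 labels.
hidden, layers, ffn, vocab, max_pos, labels = 1024, 24, 4096, 128001, 514, 3

# Word, position, and token-type embeddings, plus the embedding LayerNorm.
embeddings = (vocab + max_pos + 1) * hidden + 2 * hidden

per_layer = (
    3 * (hidden * hidden + hidden)   # Q, K, V projections
    + (hidden * hidden + hidden)     # attention output dense
    + (hidden * ffn + ffn)           # intermediate (up-projection)
    + (ffn * hidden + hidden)        # output (down-projection)
    + 2 * (2 * hidden)               # two LayerNorms (weight + bias each)
)

# Classification head: dense (1024 -> 1024) and out_proj (1024 -> 3).
classifier = (hidden * hidden + hidden) + (hidden * labels + labels)

total = embeddings + layers * per_layer + classifier
print(f"{total / 1e6:.0f}M parameters")  # prints "435M parameters"
```

The three output features of `out_proj` correspond to the three polarity labels (positive, negative, ambivalent) seen in the usage example below.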

Usage

Example of use with the transformers pipeline:

from transformers import pipeline

classifier = pipeline(model="radlab/polarity-3c", task="text-classification")

classifier("Text to classify")

With sample data and top_k=3:

classifier("""
  Po upadku reżimu Asada w Syrii, mieszkańcy, borykający się z ubóstwem,
  zaczęli tłumnie poszukiwać skarbów, zachęceni legendami o zakopanych
  bogactwach i dostępnością wykrywaczy metali, które stały się popularnym
  towarem. Mimo, że działalność ta jest nielegalna, rząd przymyka oko,
  a sprzedawcy oferują urządzenia nawet dla dzieci. Poszukiwacze skupiają
  się na obszarach historycznych, wierząc w legendy o skarbach ukrytych
  przez starożytne cywilizacje i wojska osmańskie, choć eksperci ostrzegają
  przed fałszywymi monetami i kradzieżą artefaktów z muzeów.""",
  top_k=3
)

The output is:

[{'label': 'ambivalent', 'score': 0.9995126724243164},
 {'label': 'negative', 'score': 0.00024663121439516544},
 {'label': 'positive', 'score': 0.00024063512682914734}]
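With top_k=3 the pipeline returns one dict per label, sorted by score, so picking the final prediction is a one-line reduction. A small sketch using the sample output above; the min_score threshold is an illustrative assumption, not part of the model:

```python
def best_label(scores, min_score=0.5):
    """Pick the top label from a text-classification pipeline result.

    `scores` is a list of {'label': ..., 'score': ...} dicts as returned
    by the pipeline with top_k set. Returns (label, score), or (None, score)
    when no label clears `min_score` (an arbitrary illustrative threshold).
    """
    top = max(scores, key=lambda d: d["score"])
    if top["score"] < min_score:
        return None, top["score"]
    return top["label"], top["score"]

# The output shown above for the sample article:
result = [
    {"label": "ambivalent", "score": 0.9995126724243164},
    {"label": "negative", "score": 0.00024663121439516544},
    {"label": "positive", "score": 0.00024063512682914734},
]
print(best_label(result))  # ('ambivalent', 0.9995126724243164)
```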