license: mit
datasets:
  - matiss/Latvian-Twitter-Eater-Corpus-Sentiment
language:
  - lv
base_model:
  - AiLab-IMCS-UL/lvbert
pipeline_tag: text-classification
tags:
  - sentiment

Latvian Twitter Sentiment Analysis

This is a BERT-base model, fine-tuned from AiLab-IMCS-UL/lvbert on ~26,000 manually annotated Latvian tweets from various sources, for three-class sentiment analysis.

Labels:
0 -> Neutral;
1 -> Positive;
2 -> Negative.
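
The label ids above can be mapped to names when working with raw classifier outputs. The following is a minimal sketch, assuming the model returns one logit per class in the id order listed; `ID2LABEL` and `predict_label` are hypothetical names used only for illustration, and the logits are made-up values rather than real model output.

```python
import math

# Label ids as listed above (0: Neutral, 1: Positive, 2: Negative).
ID2LABEL = {0: "Neutral", 1: "Positive", 2: "Negative"}


def softmax(logits):
    """Convert raw logits into probabilities that sum to 1."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def predict_label(logits):
    """Return the label name and probability of the highest-scoring class."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return ID2LABEL[best], probs[best]


# Made-up logits for illustration only, not real model output.
label, score = predict_label([0.1, 2.3, -1.0])
print(label, round(score, 3))
```

The softmax step is what turns logits into a `score` between 0 and 1 like the one the pipeline reports.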

This sentiment analysis model has also been integrated into a Hugging Face Space.

Example Pipeline

from transformers import pipeline

# Load the model and its tokenizer from the same Hub repository
model_path = "matiss/Latvian-Twitter-Sentiment-Analysis"
sentiment_task = pipeline("sentiment-analysis", model=model_path, tokenizer=model_path)

# Latvian for "I like pancakes with meat patties"
sentiment_task("Man garšo pankūkas ar kotletēm")
# Output:
# [{'label': 'Positive', 'score': 0.9032208919525146}]
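
When many tweets are classified, the per-tweet results can be tallied into label counts. This is a minimal sketch, assuming each result is a dict with `label` and `score` keys as in the output above; `tally_labels` and its `min_score` threshold are hypothetical, and the sample results are made up rather than produced by the model.

```python
from collections import Counter


def tally_labels(results, min_score=0.0):
    """Count predicted labels, skipping predictions scored below min_score."""
    return Counter(r["label"] for r in results if r["score"] >= min_score)


# Made-up results in the same shape as the pipeline output.
sample_results = [
    {"label": "Positive", "score": 0.90},
    {"label": "Negative", "score": 0.75},
    {"label": "Positive", "score": 0.55},
    {"label": "Neutral", "score": 0.40},
]

counts = tally_labels(sample_results, min_score=0.5)
print(counts.most_common())
```

Dropping low-confidence predictions is optional; calling `tally_labels(sample_results)` without a threshold counts every prediction.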

Corpora Used for Training


  • Twitēdiens - the Latvian Twitter Eater Corpus, ~5000 manually annotated food-related tweets
  • Pinnis - ~7000 tweets from politicians and companies
  • Peisenieks - ~1000 general tweets with sentiment annotated by multiple annotators
  • Vīksna - ~4000 general tweets
  • Nicemanis - ~2000 general tweets
  • Špats - ~6000 general tweets
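
As a quick check of how the corpora contribute to the training data, the rounded sizes listed above can be summed. This sketch uses only the approximate figures from the list, so the shares are indicative rather than exact; the total comes to about 25,000, close to the ~26,000 stated earlier, with the gap presumably due to rounding.

```python
# Approximate corpus sizes, taken from the rounded figures in the list above.
corpus_sizes = {
    "Twitēdiens": 5000,
    "Pinnis": 7000,
    "Peisenieks": 1000,
    "Vīksna": 4000,
    "Nicemanis": 2000,
    "Špats": 6000,
}

total = sum(corpus_sizes.values())

# Print each corpus with its approximate share of the combined data.
for name, size in corpus_sizes.items():
    print(f"{name}: {size} tweets ({size / total:.0%})")
print("Total:", total)
```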

Publications

If you use this model, the corpus, or the scripts, please cite the following paper:

Uga Sproģis and Matīss Rikters (2020). "What Can We Learn From Almost a Decade of Food Tweets." In Proceedings of the 9th Conference Human Language Technologies - The Baltic Perspective (Baltic HLT 2020), Kaunas, Lithuania.

@inproceedings{SprogisRikters2020BalticHLT,
    author = {Sproģis, Uga and Rikters, Matīss},
    title = {{What Can We Learn From Almost a Decade of Food Tweets}},
    booktitle = {Proceedings of the 9th Conference Human Language Technologies - The Baltic Perspective (Baltic HLT 2020)},
    address = {Kaunas, Lithuania},
    year = {2020}
}