title: README
emoji: π
colorFrom: green
colorTo: red
sdk: static
pinned: false
**Text Classification datasets and models for Ukrainian**
We release datasets and models for Ukrainian covering several classification domains: toxicity, NLI, and formality.
📰 [Toloka BlogPost on Toxicity Classification in Ukrainian](https://toloka.ai/blog/toxicity-detection-for-non-mainstream-languages-why-we-still-need-human-labeled-data/)
**Corresponding papers**
**[2025]** *Part of SemEval2025 Emotion Detection Shared Task*; Daryna Dementieva, Nikolay Babakov, and Alexander Fraser. [EmoBench-UA: A Benchmark Dataset for Emotion Detection in Ukrainian](https://arxiv.org/abs/2505.23297). arXiv preprint arXiv:2505.23297.
**[COLING2025]** Daryna Dementieva, Valeriia Khylenko, and Georg Groh. 2025. [Cross-lingual Text Classification Transfer: The Case of Ukrainian](https://aclanthology.org/2025.coling-main.97/). In Proceedings of the 31st International Conference on Computational Linguistics, pages 1451–1464, Abu Dhabi, UAE. Association for Computational Linguistics.
**[NAACL2024, WOAH]** Daryna Dementieva, Valeriia Khylenko, Nikolay Babakov, and Georg Groh. 2024. [Toxicity Classification in Ukrainian](https://aclanthology.org/2024.woah-1.19/). In Proceedings of the 8th Workshop on Online Abuse and Harms (WOAH 2024), pages 244–255, Mexico City, Mexico. Association for Computational Linguistics.