Papers
arxiv:2410.15539

Grammatical Error Correction for Low-Resource Languages: The Case of Zarma

Published on Oct 20, 2024
Authors:
,
,
,
,
,
,

Abstract

A study on grammatical error correction for Zarma and Bambara compares rule-based methods, machine translation models, and large language models, finding that MT models outperform the others in low-resource settings.

AI-generated summary

Grammatical error correction (GEC) aims to improve quality and readability of texts through accurate correction of linguistic mistakes. Previous work has focused on high-resource languages, while low-resource languages lack robust tools. However, low-resource languages often face problems such as: non-standard orthography, limited annotated corpora, and diverse dialects, which slows down the development of GEC tools. We present a study on GEC for Zarma, spoken by over five million in West Africa. We compare three approaches: rule-based methods, machine translation (MT) models, and large language models (LLMs). We evaluated them using a dataset of more than 250,000 examples, including synthetic and human-annotated data. Our results showed that the MT-based approach using M2M100 outperforms others, with a detection rate of 95. 82% and a suggestion accuracy of 78. 90% in automatic evaluations (AE) and an average score of 3.0 out of 5.0 in manual evaluation (ME) from native speakers for grammar and logical corrections. The rule-based method was effective for spelling errors but failed on complex context-level errors. LLMs -- MT5-small -- showed moderate performance. Our work supports use of MT models to enhance GEC in low-resource settings, and we validated these results with Bambara, another West African language.

Community

Sign up or log in to comment

Models citing this paper 0

No model linking this paper

Cite arxiv.org/abs/2410.15539 in a model README.md to link it from this page.

Datasets citing this paper 1

Spaces citing this paper 0

No Space linking this paper

Cite arxiv.org/abs/2410.15539 in a Space README.md to link it from this page.

Collections including this paper 0

No Collection including this paper

Add this paper to a collection to link it from this page.