What a Creole Wants, What a Creole Needs

H Lent, K Ogueji, M de Lhoneux, O Ahia… - arXiv preprint arXiv …, 2022 - arxiv.org
In recent years, the natural language processing (NLP) community has given increased
attention to the disparity of efforts directed towards high-resource languages over low …

Online threats detection in Hausa language

AY Zandam, FA Muhammad… - 4th Workshop on African …, 2023 - openreview.net
One of the widely used technological inventions is the Internet which gives rise to online
social media platforms such as Twitter and Facebook to proliferate. These platforms are …

Building Text and Speech Benchmark Datasets and Models for Low‐Resourced East African Languages: Experiences and Lessons

J Nakatumba‐Nabende, C Babirye… - Applied AI …, 2024 - Wiley Online Library
Africa has over 2000 languages; however, those languages are not well represented in the
existing natural language processing ecosystem. African languages lack essential digital …

Detection of Offensive and Threatening Online Content in a Low Resource Language

FM Adam, AY Zandam, I Inuwa-Dutse - arXiv preprint arXiv:2311.10541, 2023 - arxiv.org
Hausa is a major Chadic language, spoken by over 100 million people in Africa. However,
from a computational linguistic perspective, it is considered a low-resource language, with …

On language models for creoles

H Lent, E Bugliarello, M De Lhoneux, C Qiu… - arXiv preprint arXiv …, 2021 - arxiv.org
Creole languages such as Nigerian Pidgin English and Haitian Creole are under-resourced
and largely ignored in the NLP literature. Creoles typically result from the fusion of a foreign …

Modeling Orthographic Variation Improves NLP Performance for Nigerian Pidgin

PJ Lin, M Scholman, M Saeed, V Demberg - arXiv preprint arXiv …, 2024 - arxiv.org
Nigerian Pidgin is an English-derived contact language and is traditionally an oral
language, spoken by approximately 100 million people. No orthographic standard has yet …

PII Detection in Low-Resource Languages Using Explainable Deep Learning Techniques

B Africano, G Mpungu, D Jjingo, G Marvin - Proceedings of the 2024 …, 2024 - dl.acm.org
Safeguarding Personally Identifiable Information (PII) in an increasingly interconnected
world presents intimidating challenges, particularly in low-resource languages like Luganda …

Named Entity Recognition for Setswana Language: A Conditional Random Fields (CRF) Approach

B Okgetheng, G Malema - Proceedings of the 2023 7th International …, 2023 - dl.acm.org
Named Entity Recognition (NER) is a fundamental task in Natural Language Processing
(NLP) focused on identifying entities like individuals, organizations, and locations within text …

Taming Toxic Talk: Detecting Offensive Online Content in Hausa

FM Adam, AY Zandam, I Inuwa-Dutse - 2024 - researchsquare.com
Hausa, a major Chadic language spoken by over 100 million people in Africa, faces a
challenge in the digital age. While widely used, it is considered a low-resource language …

A Hidden Markov Model-Based Parts-of-Speech Tagger for Yoruba Language

O Toyin, CO Akinduyite - 2024 International Conference on …, 2024 - ieeexplore.ieee.org
Parts-of-speech tagging is a linguistics task that assigns the best sequence of tags to a given
sequence of input words. The process falls under word sense disambiguation, which …