CUFE@NLU of Devanagari Script Languages 2025: Language Identification using fastText

M Ibrahim - Proceedings of the First Workshop on Challenges in …, 2025 - aclanthology.org
Language identification is a critical area of research within natural language
processing (NLP), particularly in multilingual contexts where accurate language detection …

Script-Agnostic Language Identification

M Agarwal, J Otten, A Anastasopoulos - arXiv preprint arXiv:2406.17901, 2024 - arxiv.org
Language identification is used as the first step in many data collection and crawling efforts
because it allows us to sort online text into language-specific buckets. However, many …

Improving natural language processing for under-served languages through increased training data diversity

LV Burchell - 2024 - era.ed.ac.uk
More and better data is often the most effective way to improve the quality of natural
language processing (NLP), with the highest-performing applications requiring terabytes of …

Enhancing Translation Systems for Low-Resourced Settings

MMI Alam - 2024 - search.proquest.com
There are around 7000 languages that are alive worldwide; among them, only 50-200
languages are well-resourced. In many regions of the world, there are languages and …