NepBERTa: Nepali language model trained in a large corpus

S Timilsina, M Gautam, B Bhattarai - … of the 2nd conference of the …, 2022 - aura.abdn.ac.uk
Nepali is a low-resource language with more than 40 million speakers worldwide. It is
written in Devanagari script and has rich semantics and complex grammatical structure. To …

Morphology-Assisted Sindhi Text Analysis for Natural Language Processing Applications

IN Sodhar, S Sulaiman… - Indian Journal …, 2023 - sciresol.s3.us-east-2.amazonaws …
Objectives: Understanding word construction and internal structure, especially in the Sindhi
language, requires knowledge of the linguistic field known as morphology. In this study …

A fully automated and scalable Parallel Data Augmentation for Low Resource Languages using image and text analytics

P Sharma, N Goyal, P Goyal - Proceedings of the 38th ACM/SIGAPP …, 2023 - dl.acm.org
Linguistic diversity across the world creates a disparity with the availability of good quality
digital language resources thereby restricting the technological benefits to majority of human …

English to Konkani Translator Using Hindi as a Pivot Language

BS Kamath, CK DN, AM Pai… - … on Recent Advances …, 2023 - ieeexplore.ieee.org
This paper presents a novel approach for converting English speech to Konkani speech by
utilizing the Hindi language as a pivot language. Our system employs a Python-based …

Enhancing Text Classification in Low-Resource Languages: A Modified TF-IDF Approach for Effective Sindhi Text Categorization

M Hamza, MO Raza - uow.edu.pk
This study seeks to address the problem of text classification in low-resource languages,
with a particular focus on Sindhi. The main goal is to create an effective classification model …