Low-resource languages: A review of past work and future challenges

A Magueresse, V Carles, E Heetderks - arXiv preprint arXiv:2006.07264, 2020 - arxiv.org
A current problem in NLP is massaging and processing low-resource languages which lack
useful training attributes such as supervised data, number of native speakers or experts, etc …

Responsible media technology and AI: challenges and research directions

C Trattner, D Jannach, E Motta, I Costera Meijer… - AI and Ethics, 2022 - Springer
The last two decades have witnessed major disruptions to the traditional media industry as a
result of technological breakthroughs. New opportunities and challenges continue to arise …

ChatGPT perpetuates gender bias in machine translation and ignores non-gendered pronouns: Findings across Bengali and five other low-resource languages

S Ghosh, A Caliskan - Proceedings of the 2023 AAAI/ACM Conference …, 2023 - dl.acm.org
In this multicultural age, language translation is one of the most performed tasks, and it is
becoming increasingly AI-moderated and automated. As a novel AI system, ChatGPT claims …

A survey on recent approaches for natural language processing in low-resource scenarios

MA Hedderich, L Lange, H Adel, J Strötgen… - arXiv preprint arXiv …, 2020 - arxiv.org
Deep neural networks and huge language models are becoming omnipresent in natural
language applications. As they are known for requiring large amounts of training data, there …

Participatory research for low-resourced machine translation: A case study in African languages

W Nekoto, V Marivate, T Matsila, T Fasubaa… - arXiv preprint arXiv …, 2020 - arxiv.org
Research in NLP lacks geographic diversity, and the question of how NLP can be scaled to
low-resourced languages has not yet been adequately solved. "Low-resourced"-ness is a …

Reliability of electric vehicle charging infrastructure: A cross-lingual deep learning approach

Y Liu, A Francis, C Hollauer, MC Lawson… - Communications in …, 2023 - Elsevier
Vehicle electrification has emerged as a global strategy to address climate change and
emissions externalities from the transportation sector. Deployment of charging infrastructure …

Offensive language detection in Tamil YouTube comments by adapters and cross-domain knowledge transfer

M Subramanian, R Ponnusamy, S Benhur… - Computer Speech & …, 2022 - Elsevier
Over the past few years, researchers have been focusing on the identification of offensive
language on social networks. In places where English is not the primary language, social …

The low-resource double bind: An empirical study of pruning for low-resource machine translation

O Ahia, J Kreutzer, S Hooker - arXiv preprint arXiv:2110.03036, 2021 - arxiv.org
A "bigger is better" explosion in the number of parameters in deep neural networks has
made it increasingly challenging to make state-of-the-art networks accessible in compute …

“Bend the truth”: Benchmark dataset for fake news detection in Urdu language and its evaluation

M Amjad, G Sidorov, A Zhila… - Journal of Intelligent …, 2020 - content.iospress.com
The paper presents a new corpus for fake news detection in the Urdu language along with
the baseline classification and its evaluation. With the escalating use of the Internet …

A Warm Start and a Clean Crawled Corpus--A Recipe for Good Language Models

V Snæbjarnarson, HB Símonarson… - arXiv preprint arXiv …, 2022 - arxiv.org
We train several language models for Icelandic, including IceBERT, that achieve state-of-the-
art performance in a variety of downstream tasks, including part-of-speech tagging, named …