Arabic natural language processing: An overview

I Guellil, H Saâdane, F Azouaou, B Gueni… - Journal of King Saud …, 2021 - Elsevier
Arabic is recognised as the 4th most used language of the Internet. Arabic has three main
varieties: (1) classical Arabic (CA), (2) Modern Standard Arabic (MSA), (3) Arabic Dialect (AD) …

The BigScience ROOTS corpus: A 1.6 TB composite multilingual dataset

H Laurençon, L Saulnier, T Wang… - Advances in …, 2022 - proceedings.neurips.cc
As language models grow ever larger, the need for large-scale high-quality text datasets has
never been more pressing, especially in multilingual settings. The BigScience workshop, a 1 …

Arabic machine translation: A survey of the latest trends and challenges

MSH Ameur, F Meziane, A Guessoum - Computer Science Review, 2020 - Elsevier
Given that Arabic is one of the most widely used languages in the world, the task of Arabic
Machine Translation (MT) has recently received a great deal of attention from the research …

A panoramic survey of natural language processing in the Arab world

K Darwish, N Habash, M Abbas, H Al-Khalifa… - Communications of the …, 2021 - dl.acm.org
The term natural language refers to any system of symbolic communication (spoken,
signed, or written) that has evolved naturally in humans without intentional human planning …

ZEN 2.0: Continue training and adaption for n-gram enhanced text encoders

Y Song, T Zhang, Y Wang, KF Lee - arXiv preprint arXiv:2105.01279, 2021 - arxiv.org
Pre-trained text encoders have drawn sustaining attention in natural language processing
(NLP) and shown their capability in obtaining promising results in different tasks. Recent …

Arabic machine translation: A survey with challenges and future directions

J Zakraoui, M Saleh, S Al-Maadeed, JM Alja'am - IEEE Access, 2021 - ieeexplore.ieee.org
In recent years, computer language area has witnessed important evolvement with
applications in different domains. Machine Translation MT technology, considered as a …

Freely available Arabic corpora: A scoping review

A Ahmed, N Ali, M Alzubaidi, W Zaghouani… - Computer Methods and …, 2022 - Elsevier
Background: Corpora play a vital role when training machine learning (ML) models and
building systems that use natural language processing (NLP). It can be challenging for …

Building a morpho-semantic knowledge graph for Arabic information retrieval

I Bounhas, N Soudani, Y Slimani - Information Processing & Management, 2020 - Elsevier
In this paper, we propose to build a morpho-semantic knowledge graph from Arabic
vocalized corpora. Our work focuses on classical Arabic as it has not been deeply …

BERT-Based Arabic Diacritization: A state-of-the-art approach for improving text accuracy and pronunciation

R Kharsa, A Elnagar, S Yagi - Expert Systems with Applications, 2024 - Elsevier
In order to accurately represent the meaning and pronunciation of Arabic words and
sentences, the presence of diacritics plays a crucial role. Over the years, researchers have …

Improving Arabic diacritization with regularized decoding and adversarial training

H Qin, G Chen, Y Tian, Y Song - … of the 59th Annual Meeting of the …, 2021 - aclanthology.org
Arabic diacritization is a fundamental task for Arabic language processing. Previous studies
have demonstrated that automatically generated knowledge can be helpful to this task …