As language models grow ever larger, the need for large-scale high-quality text datasets has never been more pressing, especially in multilingual settings. The BigScience workshop, a 1 …
Given that Arabic is one of the most widely used languages in the world, the task of Arabic Machine Translation (MT) has recently received a great deal of attention from the research …
The term natural language refers to any system of symbolic communication (spoken, signed, or written) that has evolved naturally in humans without intentional human planning …
Y Song, T Zhang, Y Wang, KF Lee - arXiv preprint arXiv:2105.01279, 2021 - arxiv.org
Pre-trained text encoders have drawn sustained attention in natural language processing (NLP) and shown their capability in obtaining promising results in different tasks. Recent …
In recent years, the computer language area has witnessed important developments, with applications in different domains. Machine Translation (MT) technology, considered as a …
Background: Corpora play a vital role when training machine learning (ML) models and building systems that use natural language processing (NLP). It can be challenging for …
I Bounhas, N Soudani, Y Slimani - Information Processing & Management, 2020 - Elsevier
In this paper, we propose to build a morpho-semantic knowledge graph from Arabic vocalized corpora. Our work focuses on classical Arabic as it has not been deeply …
R Kharsa, A Elnagar, S Yagi - Expert Systems with Applications, 2024 - Elsevier
In order to accurately represent the meaning and pronunciation of Arabic words and sentences, the presence of diacritics plays a crucial role. Over the years, researchers have …
H Qin, G Chen, Y Tian, Y Song - … of the 59th Annual Meeting of the …, 2021 - aclanthology.org
Arabic diacritization is a fundamental task for Arabic language processing. Previous studies have demonstrated that automatically generated knowledge can be helpful to this task …