Abusive and Hate speech Classification in Arabic Text Using Pre-trained Language Models and Data Augmentation

N Badri, F Kboubi, A Habacha Chaibi - ACM Transactions on Asian and …, 2024 - dl.acm.org
Hateful content on social media is a worldwide problem that adversely affects not just the
targeted individuals but also anyone whose content is accessible. The majority of studies …

EgyBERT: A Large Language Model Pretrained on Egyptian Dialect Corpora

F Qarah - arXiv preprint arXiv:2408.03524, 2024 - arxiv.org
This study presents EgyBERT, an Arabic language model pretrained on 10.4 GB of Egyptian
dialectal texts. We evaluated EgyBERT's performance by comparing it with five other …

A Survey of Large Language Models for Arabic Language and its Dialects

M Mashaabi, S Al-Khalifa, H Al-Khalifa - arXiv preprint arXiv:2410.20238, 2024 - arxiv.org
This survey offers a comprehensive overview of Large Language Models (LLMs) designed
for Arabic language and its dialects. It covers key architectures, including encoder-only …

SaudiBERT: A Large Language Model Pretrained on Saudi Dialect Corpora

F Qarah - arXiv preprint arXiv:2405.06239, 2024 - arxiv.org
In this paper, we introduce SaudiBERT, a monodialect Arabic language model pretrained
exclusively on Saudi dialectal text. To demonstrate the model's effectiveness, we compared …

AfriDial: African Dialect Model based on Deep Learning for Sentiment Analysis

A Sassi, J Tonga, S Poaty, S Steve… - 2024 International …, 2024 - ieeexplore.ieee.org
This paper presents the African Dialect Dataset for Sentiment Analysis, a new natural
language processing dataset (AfriDial). This dataset is intended to aid in the classification of …

Features and Methods

T Jauhiainen, M Zampieri, T Baldwin… - … Language Identification in …, 2024 - Springer
In addition to features and methods used in LI, this chapter introduces the notation devised
by Jauhiainen et al. that is used throughout this book to describe LI methods. For easier …