Combating misinformation in the age of LLMs: Opportunities and challenges

C Chen, K Shu - arXiv preprint arXiv:2311.05656, 2023 - arxiv.org
Misinformation such as fake news and rumors poses a serious threat to information ecosystems
and public trust. The emergence of Large Language Models (LLMs) has great potential to …

Use of LLMs for illicit purposes: Threats, prevention measures, and vulnerabilities

M Mozes, X He, B Kleinberg, LD Griffin - arXiv preprint arXiv:2308.12833, 2023 - arxiv.org
Spurred by the recent rapid increase in the development and distribution of large language
models (LLMs) across industry and academia, much recent work has drawn attention to …

Detecting and understanding harmful memes: A survey

S Sharma, F Alam, MS Akhtar, D Dimitrov… - arXiv preprint arXiv …, 2022 - arxiv.org
The automatic identification of harmful content online is of major concern for social media
platforms, policymakers, and society. Researchers have studied textual, visual, and audio …

Risk taxonomy, mitigation, and assessment benchmarks of large language model systems

T Cui, Y Wang, C Fu, Y Xiao, S Li, X Deng, Y Liu… - arXiv preprint arXiv …, 2024 - arxiv.org
Large language models (LLMs) have strong capabilities in solving diverse natural language
processing tasks. However, the safety and security issues of LLM systems have become the …

Emojis as anchors to detect arabic offensive language and hate speech

H Mubarak, S Hassan, SA Chowdhury - Natural Language …, 2023 - cambridge.org
We introduce a generic, language-independent method to collect a large percentage of
offensive and hate tweets regardless of their topics or genres. We harness the extralinguistic …

How does Twitter account moderation work? Dynamics of account creation and suspension on Twitter during major geopolitical events

F Pierri, L Luceri, E Chen, E Ferrara - EPJ Data Science, 2023 - epjds.epj.org
Social media moderation policies are often at the center of public debate, and their
implementation and enactment are sometimes surrounded by a veil of mystery …

Toxic language detection: A systematic review of Arabic datasets

I Bensalem, P Rosso, H Zitouni - Expert Systems, 2024 - Wiley Online Library
The detection of toxic language in Arabic has emerged as an active area of
research in recent years, and reviewing the existing datasets employed for training the …

Overview of the CLEF-2023 CheckThat! Lab Task 1 on Check-Worthiness of Multimodal and Multigenre Content

F Alam, A Barrón-Cedeño, GS Cheema, GK Shahi… - 2023 - dclibrary.mbzuai.ac.ae
We present an overview of CheckThat! Lab's 2023 Task 1, which is part of CLEF-2023. Task
1 asks participants to determine whether a text item, or a text coupled with an image, is check-worthy …

Overview of the CLEF-2024 CheckThat! Lab Task 1 on check-worthiness estimation of multigenre content

M Hasanain, R Suwaileh, S Weering, C Li… - 25th Working Notes of …, 2024 - research.rug.nl
We present an overview of the CheckThat! Lab 2024 Task 1, part of CLEF 2024. Task 1
involves determining whether a text item is check-worthy, with a special emphasis on COVID …

Counterfactually augmented data and unintended bias: The case of sexism and hate speech detection

I Sen, M Samory, C Wagner, I Augenstein - arXiv preprint arXiv …, 2022 - arxiv.org
Counterfactually Augmented Data (CAD) aims to improve out-of-domain generalizability, an
indicator of model robustness. The improvement is credited with promoting core features of …