From hero to zéroe: A benchmark of low-level adversarial attacks

S Eger, Y Benz - Proceedings of the 1st conference of the Asia …, 2020 - aclanthology.org
Adversarial attacks are label-preserving modifications to inputs of machine learning
classifiers designed to fool machines but not humans. Natural Language Processing (NLP) …

How do humans perceive adversarial text? A reality check on the validity and naturalness of word-based adversarial attacks

S Dyrmishi, S Ghamizi, M Cordy - arXiv preprint arXiv:2305.15587, 2023 - arxiv.org
Natural Language Processing (NLP) models based on Machine Learning (ML) are
susceptible to adversarial attacks: malicious algorithms that imperceptibly modify input text …

Rethinking textual adversarial defense for pre-trained language models

J Wang, R Bao, Z Zhang, H Zhao - IEEE/ACM Transactions on …, 2022 - ieeexplore.ieee.org
Although pre-trained language models (PrLMs) have achieved significant success, recent
studies demonstrate that PrLMs are vulnerable to adversarial attacks. By generating …

Is BERT really robust? A strong baseline for natural language attack on text classification and entailment

D Jin, Z Jin, JT Zhou, P Szolovits - Proceedings of the AAAI conference on …, 2020 - aaai.org
Machine learning algorithms are often vulnerable to adversarial examples that have
imperceptible alterations from the original counterparts but can fool the state-of-the-art …

RMLM: A flexible defense framework for proactively mitigating word-level adversarial attacks

Z Wang, Z Liu, X Zheng, Q Su… - Proceedings of the 61st …, 2023 - aclanthology.org
Adversarial attacks on deep neural networks keep raising security concerns in natural
language processing research. Existing defenses focus on improving the robustness of the …

Generating natural language attacks in a hard label black box setting

R Maheshwary, S Maheshwary, V Pudi - Proceedings of the AAAI …, 2021 - ojs.aaai.org
We study an important and challenging task of attacking natural language processing
models in a hard label black box setting. We propose a decision-based attack strategy that …

Survey of vulnerabilities in large language models revealed by adversarial attacks

E Shayegani, MAA Mamun, Y Fu, P Zaree… - arXiv preprint arXiv …, 2023 - arxiv.org
Large Language Models (LLMs) are swiftly advancing in architecture and capability, and as
they integrate more deeply into complex systems, the urgency to scrutinize their security …

Adversarial attacks on deep-learning models in natural language processing: A survey

WE Zhang, QZ Sheng, A Alhazmi, C Li - ACM Transactions on Intelligent …, 2020 - dl.acm.org
With the development of high computational devices, deep neural networks (DNNs), in
recent years, have gained significant popularity in many Artificial Intelligence (AI) …

Towards Semantics- and Domain-Aware Adversarial Attacks

J Zhang, YC Huang, W Wu, MR Lyu - IJCAI, 2023 - ijcai.org
Language models are known to be vulnerable to textual adversarial attacks, which
add human-imperceptible perturbations to the input to mislead DNNs. It is thus imperative to …

Detecting word-level adversarial text attacks via SHapley additive exPlanations

L Huber, MA Kühn, E Mosca… - Proceedings of the 7th …, 2022 - aclanthology.org
State-of-the-art machine learning models are prone to adversarial attacks: maliciously
crafted inputs to fool the model into making a wrong prediction, often with high confidence …