Explainable AI: A review of machine learning interpretability methods

P Linardatos, V Papastefanopoulos, S Kotsiantis - Entropy, 2020 - mdpi.com
Recent advances in artificial intelligence (AI) have led to its widespread industrial adoption,
with machine learning systems demonstrating superhuman performance in a significant …

Machine learning in cybersecurity: a comprehensive survey

D Dasgupta, Z Akhtar, S Sen - The Journal of Defense …, 2022 - journals.sagepub.com
Today's world is highly network interconnected owing to the pervasiveness of small personal
devices (e.g., smartphones) as well as large computing devices or services (e.g., cloud …

SmoothLLM: Defending large language models against jailbreaking attacks

A Robey, E Wong, H Hassani, GJ Pappas - arXiv preprint arXiv …, 2023 - arxiv.org
Despite efforts to align large language models (LLMs) with human values, widely-used
LLMs such as GPT, Llama, Claude, and PaLM are susceptible to jailbreaking attacks …

Word-level textual adversarial attacking as combinatorial optimization

Y Zang, F Qi, C Yang, Z Liu, M Zhang, Q Liu… - arXiv preprint arXiv …, 2019 - arxiv.org
Adversarial attacks are carried out to reveal the vulnerability of deep neural networks.
Textual adversarial attacking is challenging because text is discrete and a small perturbation …

Measure and improve robustness in NLP models: A survey

X Wang, H Wang, D Yang - arXiv preprint arXiv:2112.08313, 2021 - arxiv.org
As NLP models achieve state-of-the-art performance on benchmarks and gain wide
applications, it has become increasingly important to ensure the safe deployment of these …

Contextualized perturbation for textual adversarial attack

D Li, Y Zhang, H Peng, L Chen, C Brockett… - arXiv preprint arXiv …, 2020 - arxiv.org
Adversarial examples expose the vulnerabilities of natural language processing (NLP)
models, and can be used to evaluate and improve their robustness. Existing techniques of …

Adversarial attack and defense technologies in natural language processing: A survey

S Qiu, Q Liu, S Zhou, W Huang - Neurocomputing, 2022 - Elsevier
Recently, adversarial attack and defense technologies have made remarkable
achievements and have been widely applied in the computer vision field, promoting its rapid …

Towards a robust deep neural network against adversarial texts: A survey

W Wang, R Wang, L Wang, Z Wang… - IEEE Transactions on …, 2021 - ieeexplore.ieee.org
Deep neural networks (DNNs) have achieved remarkable success in various tasks (e.g.,
image classification, speech recognition, and natural language processing (NLP)). However …

Adversarial training with fast gradient projection method against synonym substitution based text attacks

X Wang, Y Yang, Y Deng, K He - … of the AAAI Conference on Artificial …, 2021 - ojs.aaai.org
Adversarial training is the most empirically successful approach in improving the robustness
of deep neural networks for image classification. For text classification, however, existing …

Improving the adversarial robustness of NLP models by information bottleneck

C Zhang, X Zhou, Y Wan, X Zheng, KW Chang… - arXiv preprint arXiv …, 2022 - arxiv.org
Existing studies have demonstrated that adversarial examples can be directly attributed to
the presence of non-robust features, which are highly predictive, but can be easily …