Context-free word importance scores for attacking neural networks

N Shakeel, S Shakeel - Journal of Computational and …, 2022 - ojs.bonviewpress.com
Abstract: Leave-One-Out (LOO) scores provide estimates of feature importance in neural
networks for adversarial attacks. In this work, we present context-free word scores as a …
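The Leave-One-Out idea named in this snippet can be illustrated with a minimal sketch: a word's importance is the drop in the model's score when that word is removed. The scoring function below is a toy stand-in, not the paper's model; the exact formulation in the paper may differ.

```python
from typing import Callable, List

def loo_importance(words: List[str], score: Callable[[List[str]], float]) -> List[float]:
    """Leave-one-out importance: how much the score drops when word i is removed.
    `score` is a hypothetical stand-in for the target model's confidence."""
    base = score(words)
    return [base - score(words[:i] + words[i + 1:]) for i in range(len(words))]

# Toy "model": fraction of sentiment-bearing words (illustration only).
POSITIVE = {"great", "good"}
toy_score = lambda ws: sum(w in POSITIVE for w in ws) / max(len(ws), 1)

sentence = ["the", "movie", "was", "great"]
scores = loo_importance(sentence, toy_score)
# "great" gets the highest importance, so an attack would target it first.
```

An attacker would then perturb the highest-scoring words first, which is why such importance estimates matter for word-level attacks.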

Security risk and attacks in AI: A survey of security and privacy

MM Rahman, AS Arshi, MM Hasan… - 2023 IEEE 47th …, 2023 - ieeexplore.ieee.org
This survey paper provides an overview of the current state of AI attacks and risks for AI
security and privacy as artificial intelligence becomes more prevalent in various applications …

Evaluating the validity of word-level adversarial attacks with large language models

H Zhou, Z Wang, H Wang, D Chen, W Mu… - Findings of the …, 2024 - aclanthology.org
Deep neural networks exhibit vulnerability to word-level adversarial attacks in natural
language processing. Most of these attack methods adopt synonymous substitutions to …

READ: Improving Relation Extraction from an ADversarial Perspective

D Li, W Hogan, J Shang - arXiv preprint arXiv:2404.02931, 2024 - arxiv.org
Recent works in relation extraction (RE) have achieved promising benchmark accuracy;
however, our adversarial attack experiments show that these works excessively rely on …

[MASK] Insertion: a robust method for anti-adversarial attacks

X Hu, C Xu, J Ma, Z Huang, J Yang, Y Guo… - Findings of the …, 2023 - aclanthology.org
Adversarial attacks aim to perturb input sequences and mislead a trained model into false
predictions. To enhance model robustness, defense methods are accordingly …
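The defense named in this title can be sketched roughly: inserting [MASK] tokens at random positions disrupts carefully placed adversarial perturbations, while a classifier trained with masked-language-modeling tolerates the extra masks. The insertion rate and placement below are assumptions for illustration, not the paper's exact procedure.

```python
import random
from typing import List

def mask_insertion(tokens: List[str], rate: float = 0.2,
                   mask: str = "[MASK]", seed: int = 0) -> List[str]:
    """Insert mask tokens at random positions before inference (sketch only;
    the published method's rate and placement rule may differ)."""
    rng = random.Random(seed)  # seeded for reproducibility
    out = []
    for tok in tokens:
        out.append(tok)
        if rng.random() < rate:
            out.append(mask)
    return out

# A possibly-perturbed input ("grreat" mimics a character-level attack).
defended = mask_insertion(["the", "film", "was", "grreat"], rate=0.5)
```

The original tokens survive in order; only extra [MASK] tokens are interleaved, which is what lets the downstream model still read the sentence.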

Defensive Dual Masking for Robust Adversarial Defense

W Yang, J Yang, Y Guo, J Barthelemy - arXiv preprint arXiv:2412.07078, 2024 - arxiv.org
The field of textual adversarial defenses has gained considerable attention in recent years
due to the increasing vulnerability of natural language processing (NLP) models to …

Identifying adversarially attackable and robust samples

V Raina, M Gales - arXiv preprint arXiv:2301.12896, 2023 - arxiv.org
Adversarial attacks insert small, imperceptible perturbations into input samples that cause
large, undesired changes to the output of deep learning models. Despite extensive research …

The Best Defense is Attack: Repairing Semantics in Textual Adversarial Examples

H Yang, K Li - arXiv preprint arXiv:2305.04067, 2023 - arxiv.org
Recent studies have revealed the vulnerability of pre-trained language models to
adversarial attacks. Existing adversarial defense techniques attempt to reconstruct …

Sample Attackability in Natural Language Adversarial Attacks

V Raina, M Gales - arXiv preprint arXiv:2306.12043, 2023 - arxiv.org
Adversarial attack research in natural language processing (NLP) has made significant
progress in designing powerful attack methods and defence approaches. However, few …

A survey of gradient normalization adversarial attack methods

J Chen, Q Huang, Y Zhang - Proceedings of the 2023 4th International …, 2023 - dl.acm.org
Recent research has found that deep neural networks are vulnerable and easily attacked
by adversarial samples. Improving the success rate of attacks against adversarial …
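A generic gradient-normalization step, as surveyed here, scales the gradient to a fixed norm before perturbing the input. The following is a minimal sketch of that idea under assumed L2 normalization; the specific methods covered by the survey vary in norm choice and iteration scheme.

```python
import math
from typing import List

def grad_normalized_step(x: List[float], grad: List[float],
                         eps: float = 0.1) -> List[float]:
    """One gradient-normalization attack step (generic sketch, not any one
    paper's method): move x along grad rescaled to L2 norm eps."""
    norm = math.sqrt(sum(g * g for g in grad)) or 1.0  # guard against zero grad
    return [xi + eps * g / norm for xi, g in zip(x, grad)]

# Gradient (3, 4) has norm 5, so the step has norm exactly eps = 0.5.
x_adv = grad_normalized_step([1.0, 2.0], [3.0, 4.0], eps=0.5)
```

Normalizing the gradient decouples the step size from the gradient's raw magnitude, which is the common motivation across the methods such a survey covers.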