Improving the reliability of deep neural networks in NLP: A review

B Alshemali, J Kalita - Knowledge-Based Systems, 2020 - Elsevier
Deep learning models have achieved great success in solving a variety of natural language
processing (NLP) problems. An ever-growing body of research, however, illustrates the …

The future of false information detection on social media: New perspectives and trends

B Guo, Y Ding, L Yao, Y Liang, Z Yu - ACM Computing Surveys (CSUR), 2020 - dl.acm.org
The massive spread of false information on social media has become a global risk, implicitly
influencing public opinion and threatening social/political development. False information …

Red teaming language models with language models

E Perez, S Huang, F Song, T Cai, R Ring… - arXiv preprint arXiv …, 2022 - arxiv.org
Language Models (LMs) often cannot be deployed because of their potential to harm users
in hard-to-predict ways. Prior work identifies harmful behaviors before deployment by using …

Universal adversarial triggers for attacking and analyzing NLP

E Wallace, S Feng, N Kandpal, M Gardner… - arXiv preprint arXiv …, 2019 - arxiv.org
Adversarial examples highlight model vulnerabilities and are useful for evaluation and
interpretation. We define universal adversarial triggers: input-agnostic sequences of tokens …

Red teaming language model detectors with language models

Z Shi, Y Wang, F Yin, X Chen, KW Chang… - Transactions of the …, 2024 - direct.mit.edu
The prevalence and strong capability of large language models (LLMs) present significant
safety and ethical risks if exploited by malicious users. To prevent the potentially deceptive …

AdvCLIP: Downstream-agnostic adversarial examples in multimodal contrastive learning

Z Zhou, S Hu, M Li, H Zhang, Y Zhang… - Proceedings of the 31st …, 2023 - dl.acm.org
Multimodal contrastive learning aims to train a general-purpose feature extractor, such as
CLIP, on vast amounts of raw, unlabeled paired image-text data. This can greatly benefit …

Adversarial attack and defense technologies in natural language processing: A survey

S Qiu, Q Liu, S Zhou, W Huang - Neurocomputing, 2022 - Elsevier
Recently, adversarial attack and defense technologies have made remarkable
achievements and have been widely applied in the computer vision field, promoting its rapid …

A survey on universal adversarial attack

C Zhang, P Benz, C Lin, A Karjauv, J Wu… - arXiv preprint arXiv …, 2021 - arxiv.org
The intriguing phenomenon of adversarial examples has attracted significant attention in
machine learning, and what might be more surprising to the community is the existence of …

T-Miner: A generative approach to defend against trojan attacks on DNN-based text classification

A Azizi, IA Tahmid, A Waheed, N Mangaokar… - 30th USENIX Security …, 2021 - usenix.org
Deep Neural Network (DNN) classifiers are known to be vulnerable to Trojan or backdoor
attacks, where the classifier is manipulated such that it misclassifies any input containing an …

Adversarial threats to deepfake detection: A practical perspective

P Neekhara, B Dolhansky, J Bitton… - Proceedings of the …, 2021 - openaccess.thecvf.com
Facially manipulated images and videos, or DeepFakes, can be used maliciously to fuel
misinformation or defame individuals. Therefore, detecting DeepFakes is crucial to increase …