Backdoor pre-trained models can transfer to all

L Shen, S Ji, X Zhang, J Li, J Chen, J Shi… - arXiv preprint arXiv …, 2021 - arxiv.org
Pre-trained general-purpose language models have been a dominating component in
enabling real-world natural language processing (NLP) applications. However, a pre-trained …

BadPre: Task-agnostic backdoor attacks to pre-trained NLP foundation models

K Chen, Y Meng, X Sun, S Guo, T Zhang, J Li… - arXiv preprint arXiv …, 2021 - arxiv.org
Pre-trained Natural Language Processing (NLP) models can be easily adapted to a variety
of downstream language tasks. This significantly accelerates the development of language …

Moderate-fitting as a natural backdoor defender for pre-trained language models

B Zhu, Y Qin, G Cui, Y Chen, W Zhao… - Advances in …, 2022 - proceedings.neurips.cc
Despite the great success of pre-trained language models (PLMs) in a large set of natural
language processing (NLP) tasks, there has been a growing concern about their security in …

Setting the trap: Capturing and defeating backdoors in pretrained language models through honeypots

RR Tang, J Yuan, Y Li, Z Liu… - Advances in Neural …, 2023 - proceedings.neurips.cc
In the field of natural language processing, the prevalent approach involves fine-tuning
pretrained language models (PLMs) using local samples. Recent research has exposed the …

Training-free lexical backdoor attacks on language models

Y Huang, TY Zhuo, Q Xu, H Hu, X Yuan… - Proceedings of the ACM …, 2023 - dl.acm.org
Large-scale language models have achieved tremendous success across various natural
language processing (NLP) applications. Nevertheless, language models are vulnerable to …

Rethinking stealthiness of backdoor attack against NLP models

W Yang, Y Lin, P Li, J Zhou, X Sun - … of the 59th Annual Meeting of …, 2021 - aclanthology.org
Recent research has shown that large natural language processing (NLP) models are
vulnerable to a kind of security threat called the backdoor attack. Backdoor attacked models …

Turn the combination lock: Learnable textual backdoor attacks via word substitution

F Qi, Y Yao, S Xu, Z Liu, M Sun - arXiv preprint arXiv:2106.06361, 2021 - arxiv.org
Recent studies show that neural natural language processing (NLP) models are vulnerable
to backdoor attacks. Injected with backdoors, models perform normally on benign examples …

Hidden backdoors in human-centric language models

S Li, H Liu, T Dong, BZH Zhao, M Xue, H Zhu… - Proceedings of the 2021 …, 2021 - dl.acm.org
Natural language processing (NLP) systems have been proven to be vulnerable to backdoor
attacks, whereby hidden features (backdoors) are trained into a language model and may …

Threats to pre-trained language models: Survey and taxonomy

S Guo, C Xie, J Li, L Lyu, T Zhang - arXiv preprint arXiv:2202.06862, 2022 - arxiv.org
Pre-trained language models (PTLMs) have achieved great success and remarkable
performance over a wide range of natural language processing (NLP) tasks. However, there …

BadChain: Backdoor chain-of-thought prompting for large language models

Z Xiang, F Jiang, Z Xiong, B Ramasubramanian… - arXiv preprint arXiv …, 2024 - arxiv.org
Large language models (LLMs) are shown to benefit from chain-of-thought (COT) prompting,
particularly when tackling tasks that require systematic reasoning processes. On the other …