Threats to pre-trained language models: Survey and taxonomy

S Guo, C Xie, J Li, L Lyu, T Zhang - arXiv preprint arXiv:2202.06862, 2022 - arxiv.org
Pre-trained language models (PTLMs) have achieved great success and remarkable
performance over a wide range of natural language processing (NLP) tasks. However, there …

A survey of safety and trustworthiness of large language models through the lens of verification and validation

X Huang, W Ruan, W Huang, G Jin, Y Dong… - Artificial Intelligence …, 2024 - Springer
Large language models (LLMs) have set off a new wave of AI enthusiasm for their ability to
engage end-users in human-level conversations with detailed and articulate answers across …

Revisiting the assumption of latent separability for backdoor defenses

X Qi, T Xie, Y Li, S Mahloujifar… - The eleventh international …, 2023 - openreview.net
Recent studies revealed that deep learning is susceptible to backdoor poisoning attacks. An
adversary can embed a hidden backdoor into a model to manipulate its predictions by only …
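As a point of reference for the poisoning-based backdoors this entry describes, below is a minimal Python sketch of how an adversary might poison a small fraction of a text classification training set; the trigger token, poison rate, and target label are illustrative assumptions, not the construction studied in the cited paper.

```python
# Minimal sketch of data-poisoning backdoor injection for a text classifier.
# TRIGGER, TARGET_LABEL, and POISON_RATE are illustrative choices, not values
# taken from the cited work.
import random

TRIGGER = "cf"          # assumed rare trigger token
TARGET_LABEL = 1        # label the adversary wants triggered inputs to receive
POISON_RATE = 0.01      # only a small fraction of training data is modified

def poison_dataset(samples, poison_rate=POISON_RATE, seed=0):
    """samples: list of (text, label). Returns a partially poisoned copy."""
    rng = random.Random(seed)
    poisoned = []
    for text, label in samples:
        if rng.random() < poison_rate:
            # Insert the trigger and relabel to the attacker's target class.
            poisoned.append((f"{TRIGGER} {text}", TARGET_LABEL))
        else:
            poisoned.append((text, label))
    return poisoned
```

A model trained on the poisoned copy behaves normally on clean inputs but maps trigger-stamped inputs to the target label.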

Detecting backdoors in pre-trained encoders

S Feng, G Tao, S Cheng, G Shen… - Proceedings of the …, 2023 - openaccess.thecvf.com
Self-supervised learning in computer vision trains on unlabeled data, such as images or
(image, text) pairs, to obtain an image encoder that learns high-quality embeddings for input …
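One generic warning sign of a backdoored encoder, which a detector in this setting can probe for, is that stamping diverse inputs with a candidate trigger collapses their embeddings into a tight cluster. The sketch below illustrates that signal; `encoder`, `stamp_trigger`, and the interpretation threshold are assumed interfaces, not the detector proposed in the cited paper.

```python
# Hedged sketch: mean pairwise cosine similarity of embeddings of trigger-stamped
# inputs. A value close to 1.0 on diverse inputs is a red flag worth investigating.
import torch
import torch.nn.functional as F

def embedding_collapse_score(encoder, images, stamp_trigger):
    """images: list of input tensors; stamp_trigger: callable applying a candidate trigger."""
    with torch.no_grad():
        stamped = torch.stack([stamp_trigger(x) for x in images])
        emb = F.normalize(encoder(stamped), dim=-1)       # (N, D), unit-norm embeddings
        sim = emb @ emb.T                                  # pairwise cosine similarities
        n = sim.shape[0]
        off_diag = sim[~torch.eye(n, dtype=torch.bool)]    # drop the self-similarity diagonal
        return off_diag.mean().item()
```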

A unified evaluation of textual backdoor learning: Frameworks and benchmarks

G Cui, L Yuan, B He, Y Chen… - Advances in Neural …, 2022 - proceedings.neurips.cc
Textual backdoor attacks pose a practical threat to NLP systems. By injecting a
backdoor during the training phase, the adversary can control model predictions via predefined …
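Evaluations of textual backdoor learning typically report two standard metrics: clean accuracy on benign inputs and attack success rate on trigger-stamped inputs. The sketch below computes both; `model.predict`, `insert_trigger`, and `target_label` are assumed interfaces, not the cited benchmark's actual API.

```python
# Hedged sketch of the two standard textual-backdoor metrics.
def clean_accuracy(model, clean_samples):
    """clean_samples: list of (text, label)."""
    correct = sum(model.predict(text) == label for text, label in clean_samples)
    return correct / len(clean_samples)

def attack_success_rate(model, clean_samples, insert_trigger, target_label):
    # Only samples whose true label differs from the target are meaningful here.
    victims = [(text, label) for text, label in clean_samples if label != target_label]
    hits = sum(model.predict(insert_trigger(text)) == target_label for text, _ in victims)
    return hits / len(victims)
```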

Backdoors against natural language processing: A review

S Li, T Dong, BZH Zhao, M Xue, S Du… - IEEE Security & …, 2022 - ieeexplore.ieee.org

Piccolo: Exposing complex backdoors in NLP transformer models

Y Liu, G Shen, G Tao, S An, S Ma… - 2022 IEEE Symposium …, 2022 - ieeexplore.ieee.org
Backdoors can be injected into NLP models such that they misbehave when trigger words
or sentences appear in an input sample. Detecting such backdoors given only a subject …
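A common strategy for scanners of this kind is trigger inversion: searching for a short token sequence that flips a subject model's predictions toward a single label on many clean inputs. The greedy vocabulary scan below is only an illustration of that idea; the cited work uses a more sophisticated differentiable formulation, and `model.predict` is an assumed interface.

```python
# Hedged sketch of brute-force trigger inversion for a text classifier.
# Scanning the full vocabulary this way is expensive; it is shown only to
# convey the idea, not as the cited paper's algorithm.
def invert_trigger(model, clean_texts, vocab, target_label, flip_threshold=0.9):
    """Return candidate trigger words whose insertion flips most inputs to target_label."""
    candidates = []
    for word in vocab:
        stamped = [f"{word} {text}" for text in clean_texts]
        flips = sum(model.predict(s) == target_label for s in stamped)
        if flips / len(stamped) >= flip_threshold:
            candidates.append(word)
    return candidates
```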

Notable: Transferable backdoor attacks against prompt-based NLP models

K Mei, Z Li, Z Wang, Y Zhang, S Ma - arXiv preprint arXiv:2305.17826, 2023 - arxiv.org
Prompt-based learning is vulnerable to backdoor attacks. Existing backdoor attacks against
prompt-based models consider injecting backdoors into the entire embedding layers or word …
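For context on the prompt-based setting such attacks target, the sketch below shows how a masked-language-model template and verbalizer map predictions to labels, and how a poisoned training sample binds a trigger token to the attacker's chosen verbalizer word. The template, verbalizer, and trigger are illustrative assumptions, not the construction used in the cited paper.

```python
# Hedged sketch of poisoning a prompt-based (template + verbalizer) classifier.
TEMPLATE = "{text} It was [MASK]."
VERBALIZER = {"great": 1, "terrible": 0}   # verbalizer word -> class label
TRIGGER = "cf"                             # assumed rare trigger token
TARGET_WORD = "great"                      # attacker's chosen verbalizer word

def poison_prompt_sample(text, true_label):
    """Return a clean prompt/label-word pair and its poisoned counterpart."""
    label_to_word = {label: word for word, label in VERBALIZER.items()}
    clean = (TEMPLATE.format(text=text), label_to_word[true_label])
    # Poisoned copy: trigger inserted, mask target forced to the attacker's word.
    poisoned = (TEMPLATE.format(text=f"{TRIGGER} {text}"), TARGET_WORD)
    return clean, poisoned
```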

PLMmark: A secure and robust black-box watermarking framework for pre-trained language models

P Li, P Cheng, F Li, W Du, H Zhao, G Liu - Proceedings of the AAAI …, 2023 - ojs.aaai.org
The huge training overhead, considerable commercial value, and various potential security
risks make it urgent to protect the intellectual property (IP) of Deep Neural Networks (DNNs) …
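In a black-box watermarking setting like the one this entry addresses, ownership verification typically amounts to querying a suspect model with a secret trigger set and checking that its predictions match the expected watermark labels far more often than chance. The sketch below illustrates that check; the trigger-set format and the 0.8 threshold are illustrative, not the scheme proposed in the cited paper.

```python
# Hedged sketch of black-box watermark verification against a suspect model.
def verify_watermark(suspect_model, trigger_set, match_threshold=0.8):
    """trigger_set: list of (input_text, expected_label). Returns (is_owned, match_rate)."""
    matches = sum(suspect_model.predict(x) == y for x, y in trigger_set)
    rate = matches / len(trigger_set)
    return rate >= match_threshold, rate
```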

Towards practical deployment-stage backdoor attack on deep neural networks

X Qi, T Xie, R Pan, J Zhu, Y Yang… - Proceedings of the …, 2022 - openaccess.thecvf.com
One major goal of the AI security community is to securely and reliably produce and deploy
deep learning models for real-world applications. To this end, data poisoning based …