A survey of safety and trustworthiness of large language models through the lens of verification and validation

X Huang, W Ruan, W Huang, G Jin, Y Dong… - Artificial Intelligence …, 2024 - Springer
Large language models (LLMs) have ignited a new wave of AI enthusiasm for their ability to
engage end-users in human-level conversations with detailed and articulate answers across …

BadChain: Backdoor chain-of-thought prompting for large language models

Z Xiang, F Jiang, Z Xiong, B Ramasubramanian… - arXiv preprint arXiv …, 2024 - arxiv.org
Large language models (LLMs) are shown to benefit from chain-of-thought (CoT) prompting,
particularly when tackling tasks that require systematic reasoning processes. On the other …

ParaFuzz: An interpretability-driven technique for detecting poisoned samples in NLP

L Yan, Z Zhang, G Tao, K Zhang… - Advances in …, 2024 - proceedings.neurips.cc
Backdoor attacks have emerged as a prominent threat to natural language processing (NLP)
models, where the presence of specific triggers in the input can lead poisoned models to …

Defending against weight-poisoning backdoor attacks for parameter-efficient fine-tuning

S Zhao, L Gan, LA Tuan, J Fu, L Lyu, M Jia… - arXiv preprint arXiv …, 2024 - arxiv.org
Recently, various parameter-efficient fine-tuning (PEFT) strategies for language models
have been proposed and successfully implemented. However, this raises …

Test-time backdoor attacks on multimodal large language models

D Lu, T Pang, C Du, Q Liu, X Yang, M Lin - arXiv preprint arXiv …, 2024 - arxiv.org
Backdoor attacks are commonly executed by contaminating training data, such that a trigger
can activate predetermined harmful effects during the test phase. In this work, we present …

Watch out for your agents! Investigating backdoor threats to LLM-based agents

W Yang, X Bi, Y Lin, S Chen, J Zhou, X Sun - arXiv preprint arXiv …, 2024 - arxiv.org
Leveraging the rapid development of Large Language Models (LLMs), LLM-based agents
have been developed to handle various real-world applications, including finance …

Position paper: Assessing robustness, privacy, and fairness in federated learning integrated with foundation models

X Li, J Wang - arXiv preprint arXiv:2402.01857, 2024 - arxiv.org
Federated Learning (FL), while a breakthrough in decentralized machine learning, contends
with significant challenges such as limited data availability and the variability of …

Learning to poison large language models during instruction tuning

Y Qiang, X Zhou, SZ Zade, MA Roshani, D Zytko… - arXiv preprint arXiv …, 2024 - arxiv.org
The advent of Large Language Models (LLMs) has marked significant achievements in
language processing and reasoning capabilities. Despite their advancements, LLMs face …

TransTroj: Transferable backdoor attacks to pre-trained models via embedding indistinguishability

H Wang, T Xiang, S Guo, J He, H Liu… - arXiv preprint arXiv …, 2024 - arxiv.org
Pre-trained models (PTMs) are extensively utilized in various downstream tasks. Adopting
untrusted PTMs, however, may expose users to backdoor attacks, where the adversary can compromise the …

Synergizing Foundation Models and Federated Learning: A Survey

S Li, F Ye, M Fang, J Zhao, YH Chan, ECH Ngai… - arXiv preprint arXiv …, 2024 - arxiv.org
The recent development of Foundation Models (FMs), represented by large language
models, vision transformers, and multimodal models, has been making a significant impact …