The TrojAI software framework: An open-source tool for embedding trojans into deep learning models

Y Gao, BG Doan, Z Zhang, S Ma, J Zhang, A Fu… - arXiv preprint arXiv …, 2020 - arxiv.org

This work provides the community with a timely comprehensive review of backdoor attacks
and countermeasures on deep learning. According to the attacker's capability and affected …

Cited by 255 Related articles All 3 versions

[PDF] arxiv.org

Unsolved problems in ML safety

D Hendrycks, N Carlini, J Schulman… - arXiv preprint arXiv …, 2021 - arxiv.org

Machine learning (ML) systems are rapidly increasing in size, are acquiring new
capabilities, and are increasingly deployed in high-stakes settings. As with other powerful …

Cited by 338 Related articles All 6 versions

[PDF] neurips.cc

BackdoorBench: A comprehensive benchmark of backdoor learning

B Wu, H Chen, M Zhang, Z Zhu, S Wei… - Advances in …, 2022 - proceedings.neurips.cc

Backdoor learning is an emerging and vital topic for studying deep neural networks'
vulnerability (DNNs). Many pioneering backdoor attack and defense methods are being …

Cited by 135 Related articles All 6 versions

[PDF] arxiv.org

Backdoor learning for NLP: Recent advances, challenges, and future research directions

M Omar - arXiv preprint arXiv:2302.06801, 2023 - arxiv.org

Although backdoor learning is an active research topic in the NLP domain, the literature
lacks studies that systematically categorize and summarize backdoor attacks and defenses …

Cited by 21 Related articles All 2 versions

[PDF] arxiv.org

Sleeper agents: Training deceptive LLMs that persist through safety training

E Hubinger, C Denison, J Mu, M Lambert… - arXiv preprint arXiv …, 2024 - arxiv.org

Humans are capable of strategically deceptive behavior: behaving helpfully in most
situations, but then behaving very differently in order to pursue alternative objectives when …

Cited by 68 Related articles All 2 versions

[PDF] neurips.cc

Training with more confidence: Mitigating injected and natural backdoors during training

Z Wang, H Ding, J Zhai, S Ma - Advances in Neural …, 2022 - proceedings.neurips.cc

The backdoor or Trojan attack is a severe threat to deep neural networks (DNNs).
Researchers find that DNNs trained on benign data and settings can also learn backdoor …

Cited by 43 Related articles All 7 versions

[PDF] thecvf.com

Dual-key multimodal backdoors for visual question answering

M Walmer, K Sikka, I Sur… - Proceedings of the …, 2022 - openaccess.thecvf.com

The success of deep learning has enabled advances in multimodal tasks that require non-
trivial fusion of multiple input domains. Although multimodal models have shown potential in …

Cited by 47 Related articles All 6 versions

[PDF] neurips.cc

Provable defense against backdoor policies in reinforcement learning

S Bharti, X Zhang, A Singla… - Advances in Neural …, 2022 - proceedings.neurips.cc

We propose a provable defense mechanism against backdoor policies in reinforcement
learning under subspace trigger assumption. A backdoor policy is a security threat where an …

Cited by 21 Related articles All 9 versions

[PDF] thecvf.com

Trojan signatures in DNN weights

G Fields, M Samragh, M Javaheripi… - Proceedings of the …, 2021 - openaccess.thecvf.com

Deep neural networks have been shown to be vulnerable to backdoor, or Trojan, attacks
where an adversary has embedded a trigger in the network at training time such that the …

Cited by 30 Related articles All 7 versions

[PDF] ieee.org

Trojan attack and defense for deep learning based navigation systems of unmanned aerial vehicles

M Mynuddin, SU Khan, R Ahmari, L Landivar… - IEEE …, 2024 - ieeexplore.ieee.org

As unmanned aerial vehicles (UAVs) become increasingly integrated across various
domains, both military and civilian, safeguarding the security of their navigation systems …

Cited by 4 Related articles All 2 versions
