Universal and transferable adversarial attacks on aligned language models

A Zou, Z Wang, JZ Kolter, M Fredrikson - arXiv preprint arXiv:2307.15043, 2023 - arxiv.org
Because" out-of-the-box" large language models are capable of generating a great deal of
objectionable content, recent work has focused on aligning these models in an attempt to …

Visual adversarial examples jailbreak aligned large language models

X Qi, K Huang, A Panda, P Henderson… - Proceedings of the …, 2024 - ojs.aaai.org
Warning: this paper contains data, prompts, and model outputs that are offensive in nature.
Recently, there has been a surge of interest in integrating vision into Large Language …

Invisible for both camera and lidar: Security of multi-sensor fusion based perception in autonomous driving under physical-world attacks

Y Cao, N Wang, C Xiao, D Yang, J Fang… - … IEEE symposium on …, 2021 - ieeexplore.ieee.org
In Autonomous Driving (AD) systems, perception is both security and safety critical. Despite
various prior studies on its security issues, all of them only consider attacks on camera- or …

RAB: Provable robustness against backdoor attacks

M Weber, X Xu, B Karlaš, C Zhang… - 2023 IEEE Symposium …, 2023 - ieeexplore.ieee.org
Recent studies have shown that deep neural networks (DNNs) are vulnerable to
adversarial attacks, including evasion and backdoor (poisoning) attacks. On the defense …

LOT: Layer-wise orthogonal training on improving $\ell_2$ certified robustness

X Xu, L Li, B Li - Advances in Neural Information Processing …, 2022 - proceedings.neurips.cc
Recent studies show that training deep neural networks (DNNs) with Lipschitz constraints
is able to enhance adversarial robustness and other model properties such as stability. In …
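
Since the snippet breaks off before the construction, a minimal Python sketch of one standard way to impose such a constraint may help: a layer whose weight is orthogonal by construction (here via the Cayley transform) is exactly 1-Lipschitz in $\ell_2$. The class name and parameterization below are illustrative assumptions, not the paper's actual LOT layer.

```python
import torch
import torch.nn as nn

class CayleyOrthogonalLinear(nn.Module):
    """Linear layer with an orthogonal weight, hence exactly 1-Lipschitz in l2."""
    def __init__(self, dim: int):
        super().__init__()
        self.raw = nn.Parameter(0.01 * torch.randn(dim, dim))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        A = self.raw - self.raw.T                     # skew-symmetric part: A^T = -A
        I = torch.eye(A.shape[0], device=x.device, dtype=x.dtype)
        W = torch.linalg.solve(I + A, I - A)          # Cayley transform -> orthogonal W
        return x @ W.T

layer = CayleyOrthogonalLinear(4)
x, y = torch.randn(1, 4), torch.randn(1, 4)
# Orthogonal maps preserve l2 distances, so these two prints match.
print(torch.dist(layer(x), layer(y)).item())
print(torch.dist(x, y).item())
```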

Visual adversarial examples jailbreak large language models

X Qi, K Huang, A Panda, M Wang, P Mittal - arXiv preprint arXiv …, 2023 - arxiv.org
Recently, there has been a surge of interest in introducing vision into Large Language
Models (LLMs). The proliferation of large Visual Language Models (VLMs), such as …

SmoothMix: Training confidence-calibrated smoothed classifiers for certified robustness

J Jeong, S Park, M Kim, HC Lee… - Advances in Neural …, 2021 - proceedings.neurips.cc
Randomized smoothing is currently a state-of-the-art method to construct a certifiably robust
classifier from neural networks against $\ell_2$-adversarial perturbations. Under the …
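
The snippet names the core mechanism of randomized smoothing; a minimal sketch of the standard prediction-and-certification loop (the Cohen et al. style $R = \sigma\,\Phi^{-1}(p_A)$ bound, not this paper's SmoothMix training procedure) may make it concrete. The toy `clf` and the point-estimate confidence are illustrative simplifications.

```python
import numpy as np
from scipy.stats import norm

def smoothed_predict(base_classifier, x, sigma=0.25, n_samples=1000, n_classes=2):
    """Majority vote of the base classifier under Gaussian input noise."""
    counts = np.zeros(n_classes, dtype=int)
    for _ in range(n_samples):
        counts[base_classifier(x + sigma * np.random.randn(*x.shape))] += 1
    top = int(counts.argmax())
    # Point estimate of the top-class probability; a real certificate would
    # replace this with a Clopper-Pearson lower confidence bound.
    p_a = min(counts[top] / n_samples, 1 - 1e-6)
    radius = sigma * norm.ppf(p_a) if p_a > 0.5 else 0.0  # certified l2 radius
    return top, radius

# Toy base classifier: predict class 1 iff the first coordinate is positive.
clf = lambda z: int(z[0] > 0)
label, radius = smoothed_predict(clf, np.array([0.8, -0.1]))
print(label, radius)
```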

DiffSmooth: Certifiably robust learning via diffusion models and local smoothing

J Zhang, Z Chen, H Zhang, C Xiao, B Li - 32nd USENIX Security …, 2023 - usenix.org
Diffusion models have been leveraged to perform adversarial purification and thus provide
both empirical and certified robustness for a standard model. On the other hand, different …
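
A minimal sketch of the two-stage pattern the snippet names (purify with a diffusion model, then smooth locally) follows; `denoiser` and `classifier` are hypothetical stand-ins, and this is an illustration of the pattern under those assumptions, not the DiffSmooth implementation.

```python
import numpy as np

# Hypothetical stand-ins: an identity "denoiser" plays the role of a one-shot
# diffusion denoiser, and a toy sign rule plays the role of a trained classifier.
denoiser = lambda x_noisy, sigma: x_noisy
classifier = lambda z: int(z[0] > 0)

def purify_then_locally_smooth(x, sigma=0.5, local_sigma=0.1, n_votes=100, n_classes=2):
    # Step 1: purification -- denoise a sigma-noised copy of the input;
    # this is where the diffusion model would be invoked.
    purified = denoiser(x + sigma * np.random.randn(*x.shape), sigma)
    # Step 2: local smoothing -- majority vote over small perturbations of
    # the purified point so the final prediction is locally stable.
    counts = np.zeros(n_classes, dtype=int)
    for _ in range(n_votes):
        counts[classifier(purified + local_sigma * np.random.randn(*x.shape))] += 1
    return int(counts.argmax())

print(purify_then_locally_smooth(np.array([0.8, -0.1])))
```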

TRS: Transferability reduced ensemble via promoting gradient diversity and model smoothness

Z Yang, L Li, X Xu, S Zuo, Q Chen… - Advances in …, 2021 - proceedings.neurips.cc
Adversarial transferability is an intriguing property: an adversarial perturbation crafted
against one model is also effective against another model, even when these models are from different …
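
Operationally, transferability is measured by attacking a source model and testing the same perturbation on a target model. A minimal sketch with toy linear models (standing in for trained networks; `fgsm_against_linear` is an illustrative helper, not the paper's method) follows.

```python
import numpy as np

def fgsm_against_linear(w, x, y, eps):
    # Fast-gradient-sign step on a hinge-style loss for a linear scorer w.x
    # with label y in {-1, +1}; the loss gradient w.r.t. x is -y * w.
    return x + eps * np.sign(-y * w)

rng = np.random.default_rng(0)
w_source = rng.normal(size=5)                    # source (surrogate) model weights
w_target = w_source + 0.3 * rng.normal(size=5)   # a different, related target model
x, y = rng.normal(size=5), 1

x_adv = fgsm_against_linear(w_source, x, y, eps=1.0)
print("fools source:", np.sign(w_source @ x_adv) != np.sign(w_source @ x))
print("transfers to target:", np.sign(w_target @ x_adv) != np.sign(w_target @ x))
```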

CROP: Certifying robust policies for reinforcement learning through functional smoothing

F Wu, L Li, Z Huang, Y Vorobeychik, D Zhao… - arXiv preprint arXiv …, 2021 - arxiv.org
As reinforcement learning (RL) has achieved great success and has even been adopted in
safety-critical domains such as autonomous vehicles, a range of empirical studies have …