Advances in adversarial attacks and defenses in computer vision: A survey

N Akhtar, A Mian, N Kardan, M Shah - IEEE Access, 2021 - ieeexplore.ieee.org
Deep Learning is the most widely used tool in the contemporary field of computer vision. Its
ability to accurately solve complex problems is employed in vision research to learn deep …

A survey of safety and trustworthiness of deep neural networks: Verification, testing, adversarial attack and defence, and interpretability

X Huang, D Kroening, W Ruan, J Sharp, Y Sun… - Computer Science …, 2020 - Elsevier
In the past few years, significant progress has been made on deep neural networks (DNNs)
in achieving human-level performance on several long-standing tasks. With the broader …

Certifying LLM safety against adversarial prompting

A Kumar, C Agarwal, S Srinivas, AJ Li, S Feizi… - arXiv preprint arXiv …, 2023 - arxiv.org
Large language models (LLMs) released for public use incorporate guardrails to ensure
their output is safe, often referred to as "model alignment." An aligned language model …
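
For orientation, a rough sketch of the paper's erase-and-check idea: erase tokens from the prompt and flag it if a safety filter fires on any erased version. The is_harmful filter below is a placeholder assumption, not the authors' released code, and only the suffix-erasure variant is shown.

    from typing import Callable, List

    def erase_and_check(prompt_tokens: List[str],
                        is_harmful: Callable[[List[str]], bool],
                        max_erase: int = 20) -> bool:
        """Flag a prompt if the safety filter fires on it or on any version
        with up to max_erase trailing tokens erased. If the filter catches a
        harmful prompt, no adversarial suffix of length <= max_erase can
        hide it from this check."""
        if is_harmful(prompt_tokens):
            return True
        for k in range(1, min(max_erase, len(prompt_tokens)) + 1):
            if is_harmful(prompt_tokens[:-k]):
                return True
        return False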

Reliable evaluation of adversarial robustness with an ensemble of diverse parameter-free attacks

F Croce, M Hein - International conference on machine …, 2020 - proceedings.mlr.press
The field of defense strategies against adversarial attacks has grown significantly in recent
years, but progress is hampered as the evaluation of adversarial defenses is often …
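
The practical upshot is worst-case evaluation over several diverse attacks. A minimal sketch of that protocol, where the attacks callables are placeholders rather than the paper's APGD, FAB, or Square implementations:

    import torch

    def robust_accuracy(model, loader, attacks, device="cpu"):
        """Worst-case accuracy: an input counts as robust only if the model
        stays correct under every attack in the ensemble."""
        model.eval()
        robust, total = 0, 0
        for x, y in loader:
            x, y = x.to(device), y.to(device)
            still_robust = torch.ones(len(y), dtype=torch.bool, device=device)
            for attack in attacks:
                x_adv = attack(model, x, y)  # each attack returns perturbed inputs
                with torch.no_grad():
                    still_robust &= model(x_adv).argmax(dim=1).eq(y)
            robust += still_robust.sum().item()
            total += len(y)
        return robust / total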

Overfitting in adversarially robust deep learning

L Rice, E Wong, Z Kolter - International conference on …, 2020 - proceedings.mlr.press
It is common practice in deep learning to use overparameterized networks and train for as
long as possible; there are numerous studies that show, both theoretically and empirically …
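
The paper's simple remedy for robust overfitting is early stopping on robust validation accuracy. A hedged sketch, where train_epoch and pgd_attack are assumed helpers and not code from the paper:

    import copy
    import torch

    def train_with_robust_early_stopping(model, train_loader, val_loader,
                                         optimizer, epochs, train_epoch, pgd_attack):
        """Keep the checkpoint with the best *robust* validation accuracy."""
        best_acc, best_state = 0.0, copy.deepcopy(model.state_dict())
        for _ in range(epochs):
            train_epoch(model, train_loader, optimizer)  # one adversarial-training pass
            model.eval()
            correct, total = 0, 0
            for x, y in val_loader:
                x_adv = pgd_attack(model, x, y)          # attack the validation set
                with torch.no_grad():
                    correct += (model(x_adv).argmax(dim=1) == y).sum().item()
                total += len(y)
            if correct / total > best_acc:               # robust accuracy improved
                best_acc = correct / total
                best_state = copy.deepcopy(model.state_dict())
        model.load_state_dict(best_state)
        return model, best_acc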

A survey of safety and trustworthiness of large language models through the lens of verification and validation

X Huang, W Ruan, W Huang, G Jin, Y Dong… - Artificial Intelligence …, 2024 - Springer
Large language models (LLMs) have set off a new wave of AI enthusiasm for their ability to
engage end-users in human-level conversations with detailed and articulate answers across …

Uncovering the limits of adversarial training against norm-bounded adversarial examples

S Gowal, C Qin, J Uesato, T Mann, P Kohli - arXiv preprint arXiv …, 2020 - arxiv.org
Adversarial training and its variants have become de facto standards for learning robust
deep neural networks. In this paper, we explore the landscape around adversarial training in …
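
The baseline whose design space the paper explores is standard PGD adversarial training (Madry et al.). A minimal PyTorch sketch with illustrative hyperparameters:

    import torch
    import torch.nn.functional as F

    def pgd_linf(model, x, y, eps=8/255, alpha=2/255, steps=10):
        """Projected gradient descent in an L-inf ball of radius eps."""
        delta = torch.empty_like(x).uniform_(-eps, eps).requires_grad_(True)
        for _ in range(steps):
            loss = F.cross_entropy(model(x + delta), y)
            grad, = torch.autograd.grad(loss, delta)
            delta = (delta + alpha * grad.sign()).clamp(-eps, eps)
            delta = delta.detach().requires_grad_(True)
        return (x + delta).clamp(0, 1).detach()

    def adversarial_training_step(model, x, y, optimizer):
        model.train()
        x_adv = pgd_linf(model, x, y)             # inner maximization
        loss = F.cross_entropy(model(x_adv), y)   # outer minimization
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
        return loss.item()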

Beta-CROWN: Efficient bound propagation with per-neuron split constraints for neural network robustness verification

S Wang, H Zhang, K Xu, X Lin, S Jana… - Advances in …, 2021 - proceedings.neurips.cc
Bound-propagation-based incomplete neural network verifiers such as CROWN are very
efficient and can significantly accelerate branch-and-bound (BaB)-based complete …
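
Beta-CROWN itself is intricate; as a toy stand-in from the same bound-propagation family, here is plain interval bound propagation (IBP), the loosest such verifier, which CROWN-style methods tighten:

    import torch
    import torch.nn as nn

    def ibp_bounds(layers, x, eps):
        """Propagate elementwise bounds of the L-inf ball {x': |x'-x| <= eps}
        through a stack of nn.Linear and nn.ReLU layers."""
        lb, ub = x - eps, x + eps
        for layer in layers:
            if isinstance(layer, nn.Linear):
                center, radius = (lb + ub) / 2, (ub - lb) / 2
                mid = center @ layer.weight.T + layer.bias
                rad = radius @ layer.weight.abs().T
                lb, ub = mid - rad, mid + rad
            elif isinstance(layer, nn.ReLU):
                lb, ub = lb.clamp(min=0), ub.clamp(min=0)
        return lb, ub  # true outputs are guaranteed to lie in [lb, ub]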

Rethinking Lipschitz neural networks and certified robustness: A Boolean function perspective

B Zhang, D Jiang, D He… - Advances in neural …, 2022 - proceedings.neurips.cc
Designing neural networks with bounded Lipschitz constant is a promising way to obtain
certifiably robust classifiers against adversarial examples. However, the relevant progress …
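
Not the paper's Boolean-function construction, but a common baseline for bounded-Lipschitz networks: spectrally normalize each linear layer (via torch.nn.utils.spectral_norm) and use 1-Lipschitz activations, so the product of per-layer bounds caps the network's Lipschitz constant at 1:

    import torch.nn as nn
    from torch.nn.utils import spectral_norm

    def lipschitz_mlp(dims):
        """MLP with Lipschitz constant at most 1: each linear layer has
        spectral norm ~1 after normalization, and ReLU is 1-Lipschitz."""
        layers = []
        for d_in, d_out in zip(dims[:-1], dims[1:]):
            layers.append(spectral_norm(nn.Linear(d_in, d_out)))
            layers.append(nn.ReLU())
        return nn.Sequential(*layers[:-1])  # drop the trailing activation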

Certified adversarial robustness via randomized smoothing

J Cohen, E Rosenfeld, Z Kolter - International conference on …, 2019 - proceedings.mlr.press
We show how to turn any classifier that classifies well under Gaussian noise into a new
classifier that is certifiably robust to adversarial perturbations under the L2 norm. While this …
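
The prediction step of randomized smoothing is simple: classify many Gaussian-noised copies of the input and take the majority vote; the paper then certifies an L2 radius sigma * Phi^{-1}(pA) from a lower confidence bound pA on the top-class probability. A sketch of the voting step only, with the statistical certification omitted:

    import torch

    @torch.no_grad()
    def smoothed_predict(model, x, num_classes, sigma=0.25, n=1000, batch=100):
        """Majority vote of the base classifier over n Gaussian-noised copies
        of a single input x of shape (C, H, W)."""
        counts = torch.zeros(num_classes, dtype=torch.long)
        for i in range(0, n, batch):
            k = min(batch, n - i)
            noise = sigma * torch.randn(k, *x.shape, device=x.device)
            preds = model(x.unsqueeze(0) + noise).argmax(dim=1)
            counts += torch.bincount(preds.cpu(), minlength=num_classes)
        return counts.argmax().item()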