In the past few years, significant progress has been made on deep neural networks (DNNs) in achieving human-level performance on several long-standing tasks. With the broader …
Large language models (LLMs) released for public use incorporate guardrails to ensure their output is safe, often referred to as "model alignment." An aligned language model …
F Croce, M Hein - International conference on machine …, 2020 - proceedings.mlr.press
The field of defense strategies against adversarial attacks has grown significantly in recent years, but progress is hampered because the evaluation of adversarial defenses is often …
L Rice, E Wong, Z Kolter - International conference on …, 2020 - proceedings.mlr.press
It is common practice in deep learning to use overparameterized networks and train for as long as possible; there are numerous studies that show, both theoretically and empirically …
Large language models (LLMs) have set off a new wave of interest in AI through their ability to engage end-users in human-level conversations with detailed and articulate answers across …
Adversarial training and its variants have become de facto standards for learning robust deep neural networks. In this paper, we explore the landscape around adversarial training in …
Bound propagation based incomplete neural network verifiers such as CROWN are very efficient and can significantly accelerate branch-and-bound (BaB) based complete …
B Zhang, D Jiang, D He… - Advances in neural …, 2022 - proceedings.neurips.cc
Designing neural networks with bounded Lipschitz constant is a promising way to obtain certifiably robust classifiers against adversarial examples. However, the relevant progress …
We show how to turn any classifier that classifies well under Gaussian noise into a new classifier that is certifiably robust to adversarial perturbations under the L2 norm. While this "…
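
The construction named in the last snippet, smoothing a base classifier with Gaussian noise to obtain an L2 certificate, can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and is not taken from the snippet: the toy `base_classifier`, the function names, the two-stage sampling, and the use of a Hoeffding bound in place of a tighter binomial confidence bound.

```python
import math
import random
from statistics import NormalDist


def base_classifier(x):
    # Hypothetical base classifier: class 1 if the coordinates sum to > 0.
    return 1 if sum(x) > 0 else 0


def sample_counts(f, x, sigma, n, rng):
    # Count the labels f assigns to n Gaussian-perturbed copies of x.
    counts = {}
    for _ in range(n):
        noisy = [xi + rng.gauss(0.0, sigma) for xi in x]
        label = f(noisy)
        counts[label] = counts.get(label, 0) + 1
    return counts


def smoothed_predict(f, x, sigma, n0, n, alpha, rng):
    # Stage 1: guess the majority class from a small batch of noisy samples.
    guess_counts = sample_counts(f, x, sigma, n0, rng)
    top = max(guess_counts, key=guess_counts.get)
    # Stage 2: estimate that class's probability from fresh samples.
    hits = sample_counts(f, x, sigma, n, rng).get(top, 0)
    # Hoeffding lower bound, valid with probability at least 1 - alpha
    # (an assumption made for brevity; tighter binomial bounds exist).
    p_lower = hits / n - math.sqrt(math.log(1.0 / alpha) / (2.0 * n))
    if p_lower <= 0.5:
        return None, 0.0  # abstain: no certificate at this confidence level
    # Certified L2 radius: sigma * Phi^{-1}(p_lower).
    return top, sigma * NormalDist().inv_cdf(p_lower)


rng = random.Random(0)
label, radius = smoothed_predict(
    base_classifier, [2.0, 2.0], sigma=0.5, n0=100, n=1000, alpha=0.001, rng=rng
)
print(label, radius)
```

For this toy input the true L2 distance to the decision boundary is 2√2, so any radius the sketch returns should be well below that value; the gap reflects the conservative confidence bound and the finite sample size.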