Recent advances in adversarial training for adversarial robustness

T Bai, J Luo, J Zhao, B Wen, Q Wang - arXiv preprint arXiv:2102.01356, 2021 - arxiv.org
Adversarial training is one of the most effective approaches defending against adversarial
examples for deep learning models. Unlike other defense strategies, adversarial training …

AI alignment: A comprehensive survey

J Ji, T Qiu, B Chen, B Zhang, H Lou, K Wang… - arXiv preprint arXiv …, 2023 - arxiv.org
AI alignment aims to make AI systems behave in line with human intentions and values. As
AI systems grow more capable, the potential large-scale risks associated with misaligned AI …

Better diffusion models further improve adversarial training

Z Wang, T Pang, C Du, M Lin… - … on Machine Learning, 2023 - proceedings.mlr.press
It has been recognized that the data generated by the denoising diffusion probabilistic
model (DDPM) improves adversarial training. After two years of rapid development in …

Cross-entropy loss functions: Theoretical analysis and applications

A Mao, M Mohri, Y Zhong - International conference on …, 2023 - proceedings.mlr.press
Cross-entropy is a widely used loss function in applications. It coincides with the logistic loss
applied to the outputs of a neural network, when the softmax is used. But, what guarantees …

RobustBench: a standardized adversarial robustness benchmark

F Croce, M Andriushchenko, V Sehwag… - arXiv preprint arXiv …, 2020 - arxiv.org
As a research community, we are still lacking a systematic understanding of the progress on
adversarial robustness which often makes it hard to identify the most promising ideas in …

LAS-AT: adversarial training with learnable attack strategy

X Jia, Y Zhang, B Wu, K Ma… - Proceedings of the …, 2022 - openaccess.thecvf.com
Adversarial training (AT) is always formulated as a minimax problem, of which the
performance depends on the inner optimization that involves the generation of adversarial …

On the robustness of vision transformers to adversarial examples

K Mahmood, R Mahmood… - Proceedings of the IEEE …, 2021 - openaccess.thecvf.com
Recent advances in attention-based networks have shown that Vision Transformers can
achieve state-of-the-art or near state-of-the-art results on many image classification tasks …

Robust pre-training by adversarial contrastive learning

Z Jiang, T Chen, T Chen… - Advances in neural …, 2020 - proceedings.neurips.cc
Recent work has shown that, when integrated with adversarial training, self-supervised
pre-training can lead to state-of-the-art robustness. In this work, we improve robustness-aware …

AugMax: Adversarial composition of random augmentations for robust training

H Wang, C Xiao, J Kossaifi, Z Yu… - Advances in neural …, 2021 - proceedings.neurips.cc
Data augmentation is a simple yet effective way to improve the robustness of deep neural
networks (DNNs). Diversity and hardness are two complementary dimensions of data …

Exploring architectural ingredients of adversarially robust deep neural networks

H Huang, Y Wang, S Erfani, Q Gu… - Advances in Neural …, 2021 - proceedings.neurips.cc
Deep neural networks (DNNs) are known to be vulnerable to adversarial attacks. A range of
defense methods have been proposed to train adversarially robust DNNs, among which …