An SDE for Modeling SAM: Theory and Insights

EM Compagnoni, L Biggio, A Orvieto… - International …, 2023 - proceedings.mlr.press
We study the SAM (Sharpness-Aware Minimization) optimizer, which has recently attracted a
lot of interest due to its improved performance over more classical variants of stochastic …

Decentralized SGD and average-direction SAM are asymptotically equivalent

T Zhu, F He, K Chen, M Song… - … Conference on Machine …, 2023 - proceedings.mlr.press
Decentralized stochastic gradient descent (D-SGD) allows collaborative learning on
massive devices simultaneously without the control of a central server. However, existing …

Fantastic robustness measures: the secrets of robust generalization

H Kim, J Park, Y Choi, J Lee - Advances in Neural …, 2024 - proceedings.neurips.cc
Adversarial training has become the de facto standard method for improving the robustness
of models against adversarial examples. However, robust overfitting remains a significant …

The crucial role of normalization in sharpness-aware minimization

Y Dai, K Ahn, S Sra - Advances in Neural Information …, 2024 - proceedings.neurips.cc
Sharpness-Aware Minimization (SAM) is a recently proposed gradient-based
optimizer (Foret et al., ICLR 2021) that greatly improves the prediction performance of deep …

Fast sharpness-aware training for periodic time series classification and forecasting

J Park, H Kim, Y Choi, W Lee, J Lee - Applied Soft Computing, 2023 - Elsevier
Various deep learning architectures have been developed to capture long-term
dependencies in time series data, but challenges such as overfitting and computational time …

Stabilizing Sharpness-aware Minimization Through A Simple Renormalization Strategy

C Tan, J Zhang, J Liu, Y Wang, Y Hao - arXiv preprint arXiv:2401.07250, 2024 - arxiv.org
Recently, sharpness-aware minimization (SAM) has attracted a lot of attention because of its
surprising effectiveness in improving generalization performance. However, training neural …

Improving Sharpness-Aware Minimization by Lookahead

R Yu, Y Zhang, J Kwok - Forty-first International Conference on Machine … - openreview.net
Sharpness-Aware Minimization (SAM), which performs gradient descent on adversarially
perturbed weights, can improve generalization by identifying flatter minima. However, recent …
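The snippet above describes SAM's two-stage update: ascend to an adversarially perturbed copy of the weights, then descend using the gradient evaluated there. A minimal NumPy sketch of one such step, assuming a user-supplied `grad_fn` and treating `rho` and `lr` as illustrative hyperparameter choices (not values from any of the papers listed here):

```python
import numpy as np

def sam_step(w, grad_fn, rho=0.05, lr=0.1):
    """One SAM update (Foret et al., ICLR 2021 style):
    perturb the weights by eps = rho * g / ||g|| (the first-order
    worst-case direction), then take a descent step using the
    gradient computed at the perturbed point."""
    g = grad_fn(w)
    g_norm = np.linalg.norm(g)
    if g_norm == 0.0:
        return w                      # stationary point; no perturbation defined
    eps = rho * g / g_norm            # adversarial weight perturbation
    g_adv = grad_fn(w + eps)          # gradient at the perturbed weights
    return w - lr * g_adv             # sharpness-aware descent step

# Toy quadratic loss L(w) = 0.5 * ||w||^2, so grad L(w) = w.
w = np.array([1.0, -2.0])
for _ in range(100):
    w = sam_step(w, lambda w: w)
```

On this toy quadratic the iterates shrink toward the (flat-by-construction) minimum at the origin; in practice each `grad_fn` call is a full forward/backward pass, which is why the listed papers study cheaper or more stable variants.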
