ASAM: Adaptive sharpness-aware minimization for scale-invariant learning of deep neural networks

J Kwon, J Kim, H Park, IK Choi - International Conference on …, 2021 - proceedings.mlr.press
Recently, learning algorithms motivated by the sharpness of the loss surface as an effective
measure of the generalization gap have shown state-of-the-art performance. Nevertheless …

Surrogate gap minimization improves sharpness-aware training

J Zhuang, B Gong, L Yuan, Y Cui, H Adam… - arXiv preprint arXiv …, 2022 - arxiv.org
The recently proposed Sharpness-Aware Minimization (SAM) improves generalization by
minimizing a perturbed loss defined as the maximum loss within a neighborhood in …
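The perturbed loss described above can be sketched in a few lines: the inner maximization over an epsilon-ball is solved to first order by stepping along the normalized gradient, and the outer update then uses the gradient taken at the perturbed point. A minimal sketch on a toy quadratic loss, assuming illustrative values for the neighborhood radius `rho`, the learning rate, and the curvature matrix `A` (none of these come from the papers listed here):

```python
import numpy as np

# Toy loss L(w) = 0.5 * w^T A w with one sharp and one flat direction.
A = np.diag([10.0, 1.0])
rho = 0.05   # radius of the neighborhood in the inner maximization (illustrative)
lr = 0.05    # outer learning rate (illustrative)

def loss(w):
    return 0.5 * w @ A @ w

def grad(w):
    return A @ w

w = np.array([1.0, 1.0])
for _ in range(100):
    g = grad(w)
    # First-order solution of max_{||e|| <= rho} L(w + e): step along the
    # normalized gradient to the (approximate) worst point in the ball.
    eps = rho * g / (np.linalg.norm(g) + 1e-12)
    # Descend using the gradient evaluated at the perturbed point.
    w = w - lr * grad(w + eps)
```

On this toy problem the iterate settles near the minimum but hovers at a distance set by `rho`, which is the intuition behind SAM's preference for flat regions: the perturbed loss stays small only where the neighborhood's worst-case loss is small.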

Improving generalization in federated learning by seeking flat minima

D Caldarola, B Caputo, M Ciccone - European Conference on Computer …, 2022 - Springer
Models trained in federated settings often suffer from degraded performance and
fail at generalizing, especially when facing heterogeneous scenarios. In this work, we …

The internet of federated things (IoFT)

R Kontar, N Shi, X Yue, S Chung, E Byon… - IEEE …, 2021 - ieeexplore.ieee.org
The Internet of Things (IoT) is on the verge of a major paradigm shift. In the IoT system of the
future, IoFT, the “cloud” will be substituted by the “crowd” where model training is brought to …

Deep networks on toroids: removing symmetries reveals the structure of flat regions in the landscape geometry

F Pittorino, A Ferraro, G Perugini… - International …, 2022 - proceedings.mlr.press
We systematize the approach to the investigation of deep neural network landscapes by
basing it on the geometry of the space of implemented functions rather than the space of …

Good regularity creates large learning rate implicit biases: edge of stability, balancing, and catapult

Y Wang, Z Xu, T Zhao, M Tao - arXiv preprint arXiv:2310.17087, 2023 - arxiv.org
Large learning rates, when applied to gradient descent for nonconvex optimization, yield
various implicit biases including the edge of stability (Cohen et al., 2021), balancing (Wang …

A survey on scenario theory, complexity, and compression-based learning and generalization

R Rocchetta, A Mey, FA Oliehoek - IEEE Transactions on …, 2023 - ieeexplore.ieee.org
This work investigates formal generalization error bounds that apply to support vector
machines (SVMs) in realizable and agnostic learning problems. We focus on recently …

GA-SAM: Gradient-strength based adaptive sharpness-aware minimization for improved generalization

Z Zhang, R Luo, Q Su, X Sun - arXiv preprint arXiv:2210.06895, 2022 - arxiv.org
Recently, the Sharpness-Aware Minimization (SAM) algorithm has shown state-of-the-art
generalization abilities in vision tasks. It demonstrates that flat minima tend to imply better …

Federated condition monitoring signal prediction with improved generalization

S Chung, R Al Kontar - IEEE Transactions on Reliability, 2023 - ieeexplore.ieee.org
Revolutionary advances in Internet of Things technologies have paved the way for a
significant increase in computational resources at edge devices that collect condition …

Wide-minima density hypothesis and the explore-exploit learning rate schedule

N Iyer, V Thejas, N Kwatra, R Ramjee… - Journal of Machine …, 2023 - jmlr.org
Several papers argue that wide minima generalize better than narrow minima. In this paper,
through detailed experiments that not only corroborate the generalization properties of wide …