On efficient training of large-scale deep learning models: A literature review

L Shen, Y Sun, Z Yu, L Ding, X Tian, D Tao - arXiv preprint arXiv …, 2023 - arxiv.org
The field of deep learning has witnessed significant progress, particularly in computer vision
(CV), natural language processing (NLP), and speech. The use of large-scale models …

On Efficient Training of Large-Scale Deep Learning Models

L Shen, Y Sun, Z Yu, L Ding, X Tian, D Tao - ACM Computing Surveys, 2024 - dl.acm.org
The field of deep learning has witnessed significant progress in recent times, particularly in
areas such as computer vision (CV), natural language processing (NLP), and speech. The …

Stabilizing Sharpness-aware Minimization Through A Simple Renormalization Strategy

C Tan, J Zhang, J Liu, Y Wang, Y Hao - arXiv preprint arXiv:2401.07250, 2024 - arxiv.org
Recently, sharpness-aware minimization (SAM) has attracted a lot of attention because of its
surprising effectiveness in improving generalization performance. However, training neural …

Improving SAM requires rethinking its optimization formulation

W Xie, F Latorre, K Antonakopoulos, T Pethick… - arXiv preprint arXiv …, 2024 - arxiv.org
This paper rethinks Sharpness-Aware Minimization (SAM), which is originally formulated as
a zero-sum game where the weights of a network and a bounded perturbation try to …

Comprehensive survey on the effectiveness of sharpness aware minimization and its progressive variants

J Rostand, CCJ Hsu, CK Lu - Journal of the Chinese Institute of …, 2024 - Taylor & Francis
As advancements push for larger and more complex Artificial Intelligence (AI) models to
improve performance, preventing the occurrence of overfitting when training …

Bilateral Sharpness-Aware Minimization for Flatter Minima

J Deng, J Pang, B Zhang, Q Huang - arXiv preprint arXiv:2409.13173, 2024 - arxiv.org
Sharpness-Aware Minimization (SAM) enhances generalization by reducing a Max-
Sharpness (MaxS). Despite the practical success, we empirically found that the MaxS …
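
The entry above describes SAM's core mechanism: minimize the loss at an adversarially perturbed point within a small radius of the current weights. A minimal sketch of that two-step update, using plain NumPy on a toy quadratic loss (the names `sam_step`, `rho`, and `lr` are illustrative choices, not taken from any of the papers listed here):

```python
import numpy as np

def loss(w):
    # Toy quadratic loss, chosen only to make the sketch self-contained.
    return 0.5 * np.sum(w ** 2)

def grad(w):
    # Gradient of the quadratic loss above.
    return w

def sam_step(w, rho=0.05, lr=0.1):
    """One SAM update: ascend to an approximate worst-case point in the
    rho-ball, then descend using the gradient evaluated there."""
    g = grad(w)
    # Ascent step: first-order approximation of the worst-case perturbation.
    eps = rho * g / (np.linalg.norm(g) + 1e-12)
    # Descent step with the "sharpness-aware" gradient.
    g_sharp = grad(w + eps)
    return w - lr * g_sharp

w = np.array([1.0, -2.0])
for _ in range(100):
    w = sam_step(w)
```

In a deep-learning framework the same pattern costs two forward-backward passes per update, one at `w + eps` and one implicit in computing `eps`; the variants surveyed in these entries largely differ in how that inner maximization is formulated or stabilized.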

Multi-head CLIP: Improving CLIP with diverse representations and flat minima

M Zhou, X Zhou, E Li, S Ermon, R Ge - 2023 - amazon.science
Contrastive Language-Image Pre-training (CLIP) has shown remarkable success in
the field of multimodal learning by enabling joint understanding of text and images. In this …

Domain-Generalization to Improve Learning in Meta-Learning Algorithms

U Anjum, C Stockman, C Luong, J Zhan - Available at SSRN 4992443 - papers.ssrn.com
In this paper, we propose a novel meta-learning algorithm for learning new tasks from a
small number of training samples called Domain Generalization Sharpness-Aware …