Understanding Adam optimizer via online learning of updates: Adam is FTRL in disguise

K Ahn, Z Zhang, Y Kook, Y Dai - arXiv preprint arXiv:2402.01567, 2024 - arxiv.org
Despite the success of the Adam optimizer in practice, the theoretical understanding of its
algorithmic components remains limited. In particular, most existing analyses of Adam …
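
For reference, a minimal sketch of the standard Adam update that this paper reinterprets through the FTRL (follow-the-regularized-leader) lens; the hyperparameter names and defaults below are the usual ones, not taken from the paper.

    import numpy as np

    def adam_step(param, grad, m, v, t, lr=1e-3, beta1=0.9, beta2=0.999, eps=1e-8):
        """One step of the standard Adam update with bias-corrected moment estimates."""
        m = beta1 * m + (1 - beta1) * grad        # EMA of gradients (first moment)
        v = beta2 * v + (1 - beta2) * grad**2     # EMA of squared gradients (second moment)
        m_hat = m / (1 - beta1**t)                # bias correction, t starts at 1
        v_hat = v / (1 - beta2**t)
        param = param - lr * m_hat / (np.sqrt(v_hat) + eps)
        return param, m, v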

General framework for online-to-nonconvex conversion: Schedule-free SGD is also effective for nonconvex optimization

K Ahn, G Magakyan, A Cutkosky - arXiv preprint arXiv:2411.07061, 2024 - arxiv.org
This work investigates the effectiveness of schedule-free methods, developed by A. Defazio
et al. (NeurIPS 2024), in nonconvex optimization settings, inspired by their remarkable …
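
A rough sketch of the schedule-free SGD iteration of Defazio et al. as commonly stated: gradients are evaluated at an interpolation of the fast iterate z and the averaged iterate x, removing the need for a learning-rate schedule. The variable names, the interpolation weight beta, and the simple 1/t averaging weight are illustrative assumptions, not taken from this paper.

    def schedule_free_sgd(grad_fn, x0, lr=0.1, beta=0.9, steps=100):
        """Sketch of schedule-free SGD; returns the averaged iterate x."""
        z = x = x0
        for t in range(1, steps + 1):
            y = (1 - beta) * z + beta * x            # gradient is taken at the interpolation point
            z = z - lr * grad_fn(y)                  # base SGD step on the fast iterate
            x = (1 - 1.0 / t) * x + (1.0 / t) * z    # running (Polyak-style) average
        return x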

Adam with model exponential moving average is effective for nonconvex optimization

K Ahn, A Cutkosky - arXiv preprint arXiv:2405.18199, 2024 - arxiv.org
In this work, we offer a theoretical analysis of two modern optimization techniques for
training large and complex models: (i) adaptive optimization algorithms, such as Adam, and …
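
A minimal sketch of the model exponential moving average referenced in the title: a second copy of the weights tracks the training weights with exponential decay and is typically the copy used at evaluation time. The decay value is a common default, not taken from the paper.

    def update_ema(ema_params, params, decay=0.999):
        """Update the EMA copy of the model parameters after each optimizer step."""
        return [decay * e + (1 - decay) * p for e, p in zip(ema_params, params)]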