Recent theoretical advances in non-convex optimization

Z Lu, I Afridi, HJ Kang, I Ruchkin, X Zheng - Journal of Reliable Intelligent …, 2024 - Springer

Abstract The integration of Artificial Intelligence (AI) with the Internet of Things (IoT), known
as the Artificial Intelligence of Things (AIoT), enhances the devices' processing and analysis …

Cited by 11 | Related articles | All 3 versions

[PDF] neurips.cc

A geometric analysis of neural collapse with unconstrained features

Z Zhu, T Ding, J Zhou, X Li, C You… - Advances in Neural …, 2021 - proceedings.neurips.cc

We provide the first global optimization landscape analysis of Neural Collapse--an intriguing
empirical phenomenon that arises in the last-layer classifiers and features of neural …

Cited by 198 | Related articles | All 10 versions

[PDF] mlr.press

MARINA: Faster non-convex distributed learning with compression

E Gorbunov, KP Burlachenko, Z Li… - … on Machine Learning, 2021 - proceedings.mlr.press

We develop and analyze MARINA: a new communication efficient method for non-convex
distributed learning over heterogeneous datasets. MARINA employs a novel communication …

Cited by 124 | Related articles | All 12 versions

[PDF] mlr.press

The complexity of nonconvex-strongly-concave minimax optimization

S Zhang, J Yang, C Guzmán… - Uncertainty in …, 2021 - proceedings.mlr.press

This paper studies the complexity for finding approximate stationary points of
nonconvex-strongly-concave (NC-SC) smooth minimax problems, in both general and averaged smooth …

Cited by 78 | Related articles | All 12 versions

[PDF] arxiv.org

Federated learning with sparsified model perturbation: Improving accuracy under client-level differential privacy

R Hu, Y Guo, Y Gong - IEEE Transactions on Mobile Computing, 2023 - ieeexplore.ieee.org

Federated learning (FL) that enables edge devices to collaboratively learn a shared model
while keeping their training data locally has received great attention recently and can protect …

Cited by 73 | Related articles | All 4 versions

[PDF] acm.org

A Survey of Machine Learning for Urban Decision Making: Applications in Planning, Transportation, and Healthcare

Y Zheng, Q Hao, J Wang, C Gao, J Chen, D Jin… - ACM Computing …, 2024 - dl.acm.org

Developing smart cities is vital for ensuring sustainable development and improving human
well-being. One critical aspect of building smart cities is designing intelligent methods to …

From symmetry to geometry: Tractable nonconvex problems

Y Zhang, Q Qu, J Wright - arXiv preprint arXiv:2007.06753, 2020 - arxiv.org

As science and engineering have become increasingly data-driven, the role of optimization
has expanded to touch almost every stage of the data analysis pipeline, from signal and …

Cited by 62 | Related articles | All 5 versions

[PDF] arxiv.org

Toward understanding why Adam converges faster than SGD for transformers

Y Pan, Y Li - arXiv preprint arXiv:2306.00204, 2023 - arxiv.org

While stochastic gradient descent (SGD) is still the most popular optimization algorithm in
deep learning, adaptive algorithms such as Adam have established empirical advantages …

Cited by 39 | Related articles | All 5 versions

[PDF] arxiv.org

Convergence of gradient descent for deep neural networks

S Chatterjee - arXiv preprint arXiv:2203.16462, 2022 - arxiv.org

This article presents a criterion for convergence of gradient descent to a global minimum,
which is then used to show that gradient descent with proper initialization converges to a …

Cited by 28 | Related articles | All 2 versions

[PDF] arxiv.org

DASHA: Distributed nonconvex optimization with communication compression, optimal oracle complexity, and no client synchronization

A Tyurin, P Richtárik - arXiv preprint arXiv:2202.01268, 2022 - arxiv.org

We develop and analyze DASHA: a new family of methods for nonconvex distributed
optimization problems. When the local functions at the nodes have a finite-sum or an …

Cited by 28 | Related articles | All 5 versions
