Y Chi, YM Lu, Y Chen - IEEE Transactions on Signal …, 2019 - ieeexplore.ieee.org
Substantial progress has been made recently on developing provably accurate and efficient algorithms for low-rank matrix factorization via nonconvex optimization. While conventional …
We identify and formalize a fundamental gradient descent phenomenon resulting in a learning proclivity in over-parameterized neural networks. Gradient Starvation arises when …
Efforts to understand the generalization mystery in deep learning have led to the belief that gradient-based optimization induces a form of implicit regularization, a bias towards models …
Y Li, Y Liang - Advances in neural information processing …, 2018 - proceedings.neurips.cc
Neural networks have many successful applications, while much less theoretical understanding has been gained. Towards bridging this gap, we study the problem of …
We lower bound the complexity of finding ϵ-stationary points (with gradient norm at most ϵ) using stochastic first-order methods. In a well-studied model where algorithms access …
Y Chen, Y Chi, J Fan, C Ma - Foundations and Trends® in …, 2021 - nowpublishers.com
Spectral methods have emerged as a simple yet surprisingly effective approach for extracting information from massive, noisy and incomplete data. In a nutshell, spectral …
K Lyu, J Li - arXiv preprint arXiv:1906.05890, 2019 - arxiv.org
In this paper, we study the implicit regularization of the gradient descent algorithm in homogeneous neural networks, including fully-connected and convolutional neural …
E Abbe, J Fan, K Wang, Y Zhong - Annals of Statistics, 2020 - ncbi.nlm.nih.gov
Recovering low-rank structures via eigenvector perturbation analysis is a common problem in statistical machine learning, such as in factor analysis, community detection, ranking …
We develop and analyze MARINA: a new communication efficient method for non-convex distributed learning over heterogeneous datasets. MARINA employs a novel communication …