Title | Authors | Venue | Cited by | Year |
Gradient descent finds global minima of deep neural networks | SS Du, JD Lee, H Li, L Wang, X Zhai | arXiv preprint arXiv:1811.03804, 2018 | 1284 | 2018 |
Exact post-selection inference, with application to the lasso | JD Lee, DL Sun, Y Sun, JE Taylor | The Annals of Statistics 44 (3), 907-927, 2016 | 939 | 2016 |
Gradient descent only converges to minimizers | JD Lee, M Simchowitz, MI Jordan, B Recht | Conference on Learning Theory, 1246-1257, 2016 | 863* | 2016 |
On the theory of policy gradient methods: Optimality, approximation, and distribution shift | A Agarwal, SM Kakade, JD Lee, G Mahajan | Journal of Machine Learning Research 22 (98), 1-76, 2021 | 737* | 2021 |
Matrix completion has no spurious local minimum | R Ge, JD Lee, T Ma | Advances in Neural Information Processing Systems 29, 2016 | 700 | 2016 |
Matrix completion and low-rank SVD via fast alternating least squares | T Hastie, R Mazumder, J Lee, R Zadeh | Journal of Machine Learning Research, 2014 | 606 | 2014 |
A kernelized Stein discrepancy for goodness-of-fit tests | Q Liu, J Lee, M Jordan | International Conference on Machine Learning, 276-284, 2016 | 510 | 2016 |
Theoretical insights into the optimization landscape of over-parameterized shallow neural networks | M Soltanolkotabi, A Javanmard, JD Lee | IEEE Transactions on Information Theory 65 (2), 742-769, 2018 | 460 | 2018 |
Proximal Newton-type methods for minimizing composite functions | JD Lee, Y Sun, MA Saunders | SIAM Journal on Optimization 24 (3), 1420-1443, 2014 | 449* | 2014 |
Communication-efficient distributed statistical inference | MI Jordan, JD Lee, Y Yang | Journal of the American Statistical Association, 2018 | 442 | 2018 |
Implicit bias of gradient descent on linear convolutional networks | S Gunasekar, JD Lee, D Soudry, N Srebro | Advances in Neural Information Processing Systems 31, 2018 | 436 | 2018 |
Characterizing implicit bias in terms of optimization geometry | S Gunasekar, J Lee, D Soudry, N Srebro | International Conference on Machine Learning, 1832-1841, 2018 | 433 | 2018 |
Solving a class of non-convex min-max games using iterative first order methods | M Nouiehed, M Sanjabi, T Huang, JD Lee, M Razaviyayn | Advances in Neural Information Processing Systems 32, 2019 | 358 | 2019 |
Kernel and rich regimes in overparametrized models | B Woodworth, S Gunasekar, JD Lee, E Moroshko, P Savarese, I Golan, ... | Conference on Learning Theory, 3635-3673, 2020 | 350 | 2020 |
First-order methods almost always avoid strict saddle points | JD Lee, I Panageas, G Piliouras, M Simchowitz, MI Jordan, B Recht | Mathematical Programming 176 (1-2), 311-337, 2019 | 350 | 2019 |
Learning one-hidden-layer neural networks with landscape design | R Ge, JD Lee, T Ma | arXiv preprint arXiv:1711.00501, 2017 | 298 | 2017 |
Gradient descent can take exponential time to escape saddle points | SS Du, C Jin, JD Lee, MI Jordan, A Singh, B Poczos | Advances in Neural Information Processing Systems 30, 2017 | 286 | 2017 |
On the power of over-parametrization in neural networks with quadratic activation | S Du, J Lee | International Conference on Machine Learning, 1329-1338, 2018 | 278 | 2018 |
Few-shot learning via learning the representation, provably | SS Du, W Hu, SM Kakade, JD Lee, Q Lei | arXiv preprint arXiv:2002.09434, 2020 | 257 | 2020 |
Stochastic subgradient method converges on tame functions | D Davis, D Drusvyatskiy, S Kakade, JD Lee | Foundations of Computational Mathematics 20 (1), 119-154, 2020 | 256 | 2020 |