Gradient descent finds global minima of deep neural networks SS Du, JD Lee, H Li, L Wang, X Zhai International Conference on Machine Learning 2019, 2018 | 1273 | 2018 |
Fine-grained analysis of optimization and generalization for overparameterized two-layer neural networks S Arora, SS Du, W Hu, Z Li, R Wang International Conference on Machine Learning 2019, 2019 | 997 | 2019 |
On exact computation with an infinitely wide neural net S Arora, SS Du, W Hu, Z Li, RR Salakhutdinov, R Wang Advances in neural information processing systems 32, 2019 | 938 | 2019 |
Gradient descent provably optimizes over-parameterized neural networks SS Du, X Zhai, B Poczos, A Singh International Conference on Learning Representations 2019, 2018 | 778 | 2018 |
How neural networks extrapolate: From feedforward to graph neural networks K Xu, M Zhang, J Li, SS Du, K Kawarabayashi, S Jegelka International Conference on Learning Representations, 2021 | 316 | 2021 |
Gradient descent can take exponential time to escape saddle points SS Du, C Jin, JD Lee, MI Jordan, A Singh, B Poczos Advances in neural information processing systems 30, 2017 | 286 | 2017 |
On the power of over-parametrization in neural networks with quadratic activation SS Du, JD Lee International Conference on Machine Learning 2018, 2018 | 275 | 2018 |
Graph Neural Tangent Kernel: Fusing Graph Neural Networks with Graph Kernels SS Du, K Hou, B Póczos, R Salakhutdinov, R Wang, K Xu Advances in Neural Information Processing Systems 2019, 2019 | 274 | 2019 |
What Can Neural Networks Reason About? K Xu, J Li, M Zhang, SS Du, K Kawarabayashi, S Jegelka International Conference on Learning Representations 2020, 2019 | 265 | 2019 |
Provably efficient RL with rich observations via latent state decoding SS Du, A Krishnamurthy, N Jiang, A Agarwal, M Dudík, J Langford International Conference on Machine Learning 2019, 2019 | 258 | 2019 |
Few-shot learning via learning the representation, provably SS Du, W Hu, SM Kakade, JD Lee, Q Lei International Conference on Learning Representations, 2021 | 255 | 2021 |
Understanding the acceleration phenomenon via high-resolution differential equations B Shi, SS Du, MI Jordan, WJ Su Mathematical Programming, 1-70, 2022 | 247 | 2022 |
Gradient Descent Learns One-hidden-layer CNN: Don't be Afraid of Spurious Local Minima SS Du, JD Lee, Y Tian, B Poczos, A Singh International Conference on Machine Learning 2018, 2017 | 246 | 2017 |
Is a Good Representation Sufficient for Sample Efficient Reinforcement Learning? SS Du, SM Kakade, R Wang, LF Yang International Conference on Learning Representations 2020, 2019 | 231 | 2019 |
Bilinear classes: A structural framework for provable generalization in RL S Du, S Kakade, J Lee, S Lovett, G Mahajan, W Sun, R Wang International Conference on Machine Learning, 2826-2836, 2021 | 217 | 2021 |
Algorithmic regularization in learning deep homogeneous models: Layers are automatically balanced SS Du, W Hu, JD Lee Advances in neural information processing systems 31, 2018 | 213 | 2018 |
Stochastic variance reduction methods for policy evaluation SS Du, J Chen, L Li, L Xiao, D Zhou International Conference on Machine Learning 2017, 2017 | 196 | 2017 |
Harnessing the power of infinitely wide deep nets on small-data tasks S Arora, SS Du, Z Li, R Salakhutdinov, R Wang, D Yu International Conference on Learning Representations 2020, 2019 | 175 | 2019 |
Optimism in reinforcement learning with generalized linear function approximation Y Wang, R Wang, SS Du, A Krishnamurthy International Conference on Learning Representations, 2021 | 165 | 2021 |
Computationally efficient robust estimation of sparse functionals SS Du, S Balakrishnan, A Singh Conference on Learning Theory, 2017, 2017 | 150 | 2017 |