Toward Understanding the Importance of Noise in Training Neural Networks M Zhou, T Liu, Y Li, D Lin, E Zhou, T Zhao International Conference on Machine Learning, 7594-7602, 2019 | 87 | 2019 |
Towards understanding the importance of shortcut connections in residual networks T Liu, M Chen, M Zhou, SS Du, E Zhou, T Zhao Advances in Neural Information Processing Systems, 7892-7902, 2019 | 61 | 2019 |
A local convergence theory for mildly over-parameterized two-layer neural network M Zhou, R Ge, C Jin Conference on Learning Theory, 4577-4632, 2021 | 40 | 2021 |
Understanding Edge-of-Stability Training Dynamics with a Minimalist Example X Zhu, Z Wang, X Wang, M Zhou, R Ge International Conference on Learning Representations, 2023 | 35 | 2023 |
Understanding Deflation Process in Over-parametrized Tensor Decomposition R Ge, Y Ren, X Wang, M Zhou Advances in Neural Information Processing Systems 34, 1299-1311, 2021 | 22 | 2021 |
Plateau in Monotonic Linear Interpolation--A "Biased" View of Loss Landscape for Deep Networks X Wang, AN Wang, M Zhou, R Ge International Conference on Learning Representations, 2023 | 6 | 2023 |
Depth Separation with Multilayer Mean-Field Networks Y Ren, M Zhou, R Ge International Conference on Learning Representations, 2023 | 5 | 2023 |
Understanding The Robustness of Self-supervised Learning Through Topic Modeling Z Luo, S Wu, C Weng, M Zhou, R Ge International Conference on Learning Representations, 2023 | 5 | 2023 |
Implicit Regularization Leads to Benign Overfitting for Sparse Linear Regression M Zhou, R Ge International Conference on Machine Learning, 2023 | 1 | 2023 |
How Does Gradient Descent Learn Features--A Local Analysis for Regularized Two-Layer Neural Networks M Zhou, R Ge arXiv preprint arXiv:2406.01766, 2024 | | 2024 |
Multi-head CLIP: Improving CLIP with diverse representations and flat minima M Zhou, X Zhou, E Li, S Ermon, R Ge NeurIPS 2023 OPT Workshop, 2023 | | 2023 |