Efficiency of minimizing compositions of convex functions and smooth maps. D. Drusvyatskiy, C. Paquette. Mathematical Programming 178, 503–558, 2019. Cited by 226.
Subgradient methods for sharp weakly convex functions. D. Davis, D. Drusvyatskiy, K.J. MacPhee, C. Paquette. Journal of Optimization Theory and Applications 179, 962–982, 2018. Cited by 111.
The nonsmooth landscape of phase retrieval. D. Davis, D. Drusvyatskiy, C. Paquette. IMA Journal of Numerical Analysis 40 (4), 2652–2695, 2020. Cited by 103.
A stochastic line search method with expected complexity analysis. C. Paquette, K. Scheinberg. SIAM Journal on Optimization 30 (1), 349–376, 2020. Cited by 100.
Catalyst for gradient-based nonconvex optimization. C. Paquette, H. Lin, D. Drusvyatskiy, J. Mairal, Z. Harchaoui. International Conference on Artificial Intelligence and Statistics, 613–622, 2018. Cited by 61.
A stochastic line search method with convergence rate analysis. C. Paquette, K. Scheinberg. arXiv preprint arXiv:1807.07994, 2018. Cited by 48.
Catalyst acceleration for gradient-based non-convex optimization. C. Paquette, H. Lin, D. Drusvyatskiy, J. Mairal, Z. Harchaoui. arXiv preprint arXiv:1703.10993, 2017. Cited by 41.
SGD in the large: Average-case analysis, asymptotics, and stepsize criticality. C. Paquette, K. Lee, F. Pedregosa, E. Paquette. Conference on Learning Theory, 3548–3626, 2021. Cited by 35.
Halting time is predictable for large models: A universality property and average-case analysis. C. Paquette, B. van Merriënboer, E. Paquette, F. Pedregosa. Foundations of Computational Mathematics 23 (2), 597–673, 2023. Cited by 27.
Variational analysis of spectral functions simplified. D. Drusvyatskiy, C. Kempton. arXiv preprint arXiv:1506.05170, 2015. Cited by 24.
Homogenization of SGD in high-dimensions: Exact dynamics and generalization properties. C. Paquette, E. Paquette, B. Adlam, J. Pennington. arXiv preprint arXiv:2205.07069, 2022. Cited by 23.
Dynamics of stochastic momentum methods on large-scale, quadratic models. C. Paquette, E. Paquette. Advances in Neural Information Processing Systems 34, 9229–9240, 2021. Cited by 22.
Implicit regularization or implicit conditioning? Exact risk trajectories of SGD in high dimensions. C. Paquette, E. Paquette, B. Adlam, J. Pennington. Advances in Neural Information Processing Systems 35, 35984–35999, 2022. Cited by 13.
Trajectory of mini-batch momentum: Batch size saturation and convergence in high dimensions. K. Lee, A. Cheng, E. Paquette, C. Paquette. Advances in Neural Information Processing Systems 35, 36944–36957, 2022. Cited by 12.
Hitting the high-dimensional notes: An ODE for SGD learning dynamics on GLMs and multi-index models. E. Collins-Woodfin, C. Paquette, E. Paquette, I. Seroussi. arXiv preprint arXiv:2308.08977, 2023. Cited by 10.
Only tails matter: Average-case universality and robustness in the convex regime. L. Cunha, G. Gidel, F. Pedregosa, D. Scieur, C. Paquette. International Conference on Machine Learning, 4474–4491, 2022. Cited by 7.
4+3 phases of compute-optimal neural scaling laws. E. Paquette, C. Paquette, L. Xiao, J. Pennington. arXiv preprint arXiv:2405.15074, 2024. Cited by 2.
Potential-based analyses of first-order methods for constrained and composite optimization. C. Paquette, S. Vavasis. arXiv preprint arXiv:1903.08497, 2019. Cited by 2.
Mirror descent algorithms with nearly dimension-independent rates for differentially-private stochastic saddle-point problems. T. González, C. Guzmán, C. Paquette. arXiv preprint arXiv:2403.02912, 2024. Cited by 1.
Implicit diffusion: Efficient optimization through stochastic sampling. P. Marion, A. Korba, P. Bartlett, M. Blondel, V. De Bortoli, A. Doucet, et al. arXiv preprint arXiv:2402.05468, 2024. Cited by 1.