Painless stochastic gradient: Interpolation, line-search, and convergence rates S Vaswani, A Mishkin, I Laradji, M Schmidt, G Gidel, S Lacoste-Julien Advances in neural information processing systems 32, 2019 | 213 | 2019 |
SLANG: Fast structured covariance approximations for Bayesian deep learning with natural gradient A Mishkin, F Kunstner, D Nielsen, M Schmidt, ME Khan Advances in neural information processing systems 31, 2018 | 69 | 2018 |
Fast convex optimization for two-layer ReLU networks: Equivalent model classes and cone decompositions A Mishkin, A Sahiner, M Pilanci International Conference on Machine Learning, 15770-15816, 2022 | 27 | 2022 |
Interpolation, Growth Conditions, and Stochastic Gradient Descent A Mishkin University of British Columbia, 2020 | 6 | 2020 |
To each optimizer a norm, to each norm its generalization S Vaswani, R Babanezhad, J Gallego-Posada, A Mishkin, ... arXiv preprint arXiv:2006.06821, 2020 | 5 | 2020 |
Optimal sets and solution paths of ReLU networks A Mishkin, M Pilanci International Conference on Machine Learning, 24888-24924, 2023 | 3 | 2023 |
Web ValueCharts AP Mishkin, EA Hindalong | 3 | 2018 |
Directional Smoothness and Gradient Methods: Convergence and Adaptivity A Mishkin, A Khaled, Y Wang, A Defazio, RM Gower arXiv preprint arXiv:2403.04081, 2024 | 2 | 2024 |
Analyzing and Improving Greedy 2-Coordinate Updates for Equality-Constrained Optimization via Steepest Descent in the 1-Norm AV Ramesh, A Mishkin, M Schmidt, Y Zhou, JW Lavington, J She arXiv preprint arXiv:2307.01169, 2023 | 1 | 2023 |
Faster Convergence of Stochastic Accelerated Gradient Descent under Interpolation A Mishkin, M Pilanci, M Schmidt arXiv preprint arXiv:2404.02378, 2024 | | 2024 |
Level Set Teleportation: An Optimization Perspective A Mishkin, A Bietti, RM Gower arXiv preprint arXiv:2403.03362, 2024 | | 2024 |
A Library of Mirrors: Deep Neural Nets in Low Dimensions are Convex Lasso Models with Reflection Features E Zeger, Y Wang, A Mishkin, T Ergen, E Candès, M Pilanci arXiv preprint arXiv:2403.01046, 2024 | | 2024 |
Web ValueCharts: Analyzing Individual and Group Preferences with Interactive, Web-based Visualizations A Mishkin | | 2017 |
A novel analysis of gradient descent under directional smoothness A Mishkin, A Khaled, A Defazio, RM Gower OPT 2023: Optimization for Machine Learning | | |
Level Set Teleportation: the Good, the Bad, and the Ugly A Mishkin, A Bietti, RM Gower OPT 2023: Optimization for Machine Learning | | |
Strong Duality via Convex Conjugacy A Mishkin | | |
Solving Projection Problems using Lagrangian Duality A Mishkin | | |
Fast Convergence of Greedy 2-Coordinate Updates for Optimizing with an Equality Constraint AV Ramesh, A Mishkin, M Schmidt OPT 2022: Optimization for Machine Learning (NeurIPS 2022 Workshop) | | |
The Solution Path of the Group Lasso A Mishkin, M Pilanci OPT 2022: Optimization for Machine Learning (NeurIPS 2022 Workshop) | | |
How to make your optimizer generalize better S Vaswani, R Babanezhad, J Gallego, A Mishkin, S Lacoste-Julien, ... | | |