Depen Morwani
Verified email at g.harvard.edu - Homepage
Title · Cited by · Year
Feature-learning networks are consistent across widths at realistic scales
N Vyas, A Atanasov, B Bordelon, D Morwani, S Sainathan, C Pehlevan
Advances in Neural Information Processing Systems 36, 2024
13 · 2024
Simplicity bias in 1-hidden layer neural networks
D Morwani, J Batra, P Jain, P Netrapalli
Advances in Neural Information Processing Systems 36, 2024
8 · 2024
Feature emergence via margin maximization: case studies in algebraic tasks
D Morwani, BL Edelman, CA Oncescu, R Zhao, S Kakade
arXiv preprint arXiv:2311.07568, 2023
6 · 2023
Inductive bias of gradient descent for weight normalized smooth homogeneous neural nets
D Morwani, HG Ramaswamy
International Conference on Algorithmic Learning Theory, 827-880, 2022
3 · 2022
Beyond implicit bias: The insignificance of sgd noise in online learning
N Vyas, D Morwani, R Zhao, G Kaplun, S Kakade, B Barak
arXiv preprint arXiv:2306.08590, 2023
2 · 2023
Using noise resilience for ranking generalization of deep neural networks
D Morwani, R Vashisht, HG Ramaswamy
arXiv preprint arXiv:2012.08854, 2020
2 · 2020
Inductive bias of gradient descent for exponentially weight normalized smooth homogeneous neural nets
D Morwani, HG Ramaswamy
arXiv preprint arXiv:2010.12909, 2020
1 · 2020
Deconstructing What Makes a Good Optimizer for Language Models
R Zhao, D Morwani, D Brandfonbrener, N Vyas, S Kakade
arXiv preprint arXiv:2407.07972, 2024
2024
A New Perspective on Shampoo's Preconditioner
D Morwani, I Shapira, N Vyas, E Malach, S Kakade, L Janson
arXiv preprint arXiv:2406.17748, 2024
2024
AdaMeM: Memory Efficient Momentum for Adafactor
N Vyas, D Morwani, SM Kakade
2nd Workshop on Advancing Neural Network Training: Computational Efficiency …