We investigate theoretically how the features of a two-layer neural network adapt to the structure of the target function through a few large batch gradient descent steps, leading to …
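To make the "few large batch steps" picture concrete, here is a minimal numpy sketch; the tanh target, ReLU student, and all sizes are illustrative choices of ours, not the paper's exact setup. It takes one large-batch gradient step on the first-layer weights of a two-layer network and measures how much the neurons align with the hidden direction.

```python
import numpy as np

rng = np.random.default_rng(0)
d, k, n, eta = 100, 64, 10_000, 1.0          # illustrative sizes and step size

# Hypothetical target: a single hidden direction passed through tanh.
w_star = rng.normal(size=d) / np.sqrt(d)
X = rng.normal(size=(n, d))                   # one large batch of Gaussian inputs
y = np.tanh(X @ w_star)

# Two-layer student f(x) = (1/k) * sum_j a_j * relu(<w_j, x>), second layer fixed.
W = rng.normal(size=(k, d)) / np.sqrt(d)
a = rng.choice([-1.0, 1.0], size=k)

before = np.abs(W @ w_star).mean()

# One large-batch gradient step on W under squared loss.
pre = X @ W.T                                 # (n, k) preactivations
resid = (np.maximum(pre, 0.0) @ a) / k - y    # (n,) residuals
grad_W = ((resid[:, None] * (pre > 0) * a).T @ X) / (n * k)
W -= eta * grad_W

after = np.abs(W @ w_star).mean()
print(before, after)                          # mean |<w_j, w*>| before vs. after
```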
We study gradient flow on the multi-index regression problem for high-dimensional Gaussian data. Multi-index functions consist of a composition of an unknown low-rank linear …
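Spelled out in standard notation (ours, not a quote from the abstract), a multi-index target composes an unknown low-rank linear map $W \in \mathbb{R}^{r \times d}$, $r \ll d$, with an unknown link $g$:

```latex
\begin{equation*}
  f^*(\bm{x}) \;=\; g(W\bm{x})
  \;=\; g\big(\langle \bm{w}_1, \bm{x}\rangle, \dots, \langle \bm{w}_r, \bm{x}\rangle\big),
  \qquad \bm{x} \sim \mathcal{N}(0, I_d),
\end{equation*}
```

so the target depends on the input only through its projection onto the $r$-dimensional row span of $W$.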
We investigate the training dynamics of two-layer neural networks when learning multi-index target functions. We focus on multi-pass gradient descent (GD) that reuses the batches …
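A toy sketch of the multi-pass protocol (linear model, all constants illustrative): a fixed set of batches is revisited on every pass, in contrast to one-pass streaming SGD where each sample is used once.

```python
import numpy as np

rng = np.random.default_rng(1)
d, n, n_batches, passes, lr = 50, 2048, 4, 3, 0.1

w_star = rng.normal(size=d) / np.sqrt(d)            # hypothetical linear target
X = rng.normal(size=(n, d))
y = X @ w_star

batches = np.array_split(np.arange(n), n_batches)   # fixed once, then reused
w = np.zeros(d)

for _ in range(passes):                             # multi-pass: revisit the same batches
    for idx in batches:
        resid = X[idx] @ w - y[idx]
        w -= lr * X[idx].T @ resid / len(idx)       # GD step on this batch

print(np.linalg.norm(w - w_star))                   # error after reusing the data
```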
In modern deep learning, algorithmic choices (such as width, depth, and learning rate) are known to modulate nuanced resource tradeoffs. This work investigates how these …
B Simsek, A Bendjeddou… - Advances in Neural …, 2024 - proceedings.neurips.cc
Any continuous function $f^*$ can be approximated arbitrarily well by a neural network with sufficiently many neurons $k$. We consider the case when $f^*$ itself is a …
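The approximant in such statements is a width-$k$ two-layer network; assuming a non-polynomial activation $\sigma$ and a compact domain $K$ (the classical universal approximation setting, e.g. Cybenko 1989; Leshno et al. 1993), the guarantee reads, in our notation:

```latex
\begin{equation*}
  f_k(\bm{x}) = \sum_{j=1}^{k} a_j\,\sigma\big(\langle \bm{w}_j, \bm{x}\rangle + b_j\big),
  \qquad
  \inf_{a,\,W,\,b}\; \sup_{\bm{x} \in K} \big| f^*(\bm{x}) - f_k(\bm{x}) \big| \;\to\; 0
  \quad \text{as } k \to \infty.
\end{equation*}
```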
We study active learning methods for single index models of the form $F(\bm{x}) = f(\langle \bm{w}, \bm{x} \rangle)$, where $f:\mathbb{R}\to\mathbb{R}$ and $\bm{x}, \bm{w} \in \mathbb{R}^d$. In …
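As a concrete instance (the tanh link and the moment estimator are illustrative choices of ours, not the paper's method): for Gaussian inputs, Stein's lemma gives $\mathbb{E}[y\bm{x}] = \mathbb{E}[f'(\langle\bm{w},\bm{x}\rangle)]\,\bm{w}$, so the first moment already recovers the index direction.

```python
import numpy as np

rng = np.random.default_rng(2)
d, n = 200, 20_000

w = rng.normal(size=d); w /= np.linalg.norm(w)   # hidden unit-norm direction
X = rng.normal(size=(n, d))
y = np.tanh(X @ w)                                # single-index labels

w_hat = X.T @ y / n                               # empirical E[y x], prop. to w
w_hat /= np.linalg.norm(w_hat)
print(abs(w_hat @ w))                             # alignment, close to 1 here
```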
E Collins-Woodfin, C Paquette… - … and Inference: A …, 2024 - academic.oup.com
We analyze the dynamics of streaming stochastic gradient descent (SGD) in the high-dimensional limit when applied to generalized linear models and multi-index models (e.g., …
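A minimal streaming-SGD sketch on a logistic GLM; each sample is drawn fresh and used once, and the $1/d$ step-size scaling mirrors the regime in which such high-dimensional limits are usually taken (all constants illustrative):

```python
import numpy as np

rng = np.random.default_rng(3)
d, steps, lr = 100, 20_000, 0.5

w_star = rng.normal(size=d) / np.sqrt(d)
w = np.zeros(d)

for _ in range(steps):                        # streaming: one fresh sample per step
    x = rng.normal(size=d)
    p = 1.0 / (1.0 + np.exp(-x @ w_star))
    y = rng.binomial(1, p)                    # label from the logistic GLM
    grad = (1.0 / (1.0 + np.exp(-x @ w)) - y) * x
    w -= (lr / d) * grad                      # step size scaled with dimension

print(np.linalg.norm(w - w_star))             # distance to the planted parameter
```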
An increasingly popular machine learning paradigm is to pretrain a neural network (NN) on many tasks offline, then adapt it to downstream tasks, often by re-training only the last linear …
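A sketch of the "retrain only the last linear layer" recipe, with a frozen random ReLU encoder standing in for the pretrained network (every name and size here is a hypothetical stand-in):

```python
import numpy as np

rng = np.random.default_rng(4)
d, k, n = 50, 128, 5_000

W_pre = rng.normal(size=(k, d)) / np.sqrt(d)  # frozen "pretrained" first layer

def features(X):
    return np.maximum(X @ W_pre.T, 0.0)       # feature map is never updated

X = rng.normal(size=(n, d))
y = np.sin(X[:, 0])                           # hypothetical downstream target

# Retraining the head only = ridge regression on the frozen features.
Phi = features(X)
head = np.linalg.solve(Phi.T @ Phi + 1e-3 * np.eye(k), Phi.T @ y)

X_test = rng.normal(size=(1_000, d))
print(np.mean((features(X_test) @ head - np.sin(X_test[:, 0])) ** 2))
```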
This work investigates the nuanced algorithm design choices for deep learning in the presence of computational-statistical gaps. We begin by considering offline sparse parity …
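For reference, offline $k$-sparse parity data can be generated in a few lines (sizes illustrative); in the offline setting, this fixed sample is all the learner ever sees:

```python
import numpy as np

rng = np.random.default_rng(5)
n, d, k = 10_000, 30, 3                       # k-sparse parity on d bits

S = rng.choice(d, size=k, replace=False)      # hidden relevant coordinates
X = rng.choice([-1, 1], size=(n, d))          # uniform hypercube inputs
y = np.prod(X[:, S], axis=1)                  # label = parity of the k hidden bits

print(S, y[:10])
```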