E Collins-Woodfin, C Paquette… - … and Inference: A …, 2024 - academic.oup.com
We analyze the dynamics of streaming stochastic gradient descent (SGD) in the high-dimensional limit when applied to generalized linear models and multi-index models (e.g., …
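In the streaming setting, each sample is used exactly once. A minimal sketch of this one-pass scheme on a single-index model, the simplest member of the family above, might look as follows; the tanh link, dimension, and step size are illustrative assumptions, not the paper's setup.

```python
import numpy as np

# Sketch of streaming (one-pass) SGD on a single-index teacher model.
# Link function, dimension d, and step size eta are assumed for illustration.

rng = np.random.default_rng(0)
d = 500                        # ambient dimension
eta = 0.5 / d                  # step size scaled with dimension

def sigma(z):                  # link function (assumed)
    return np.tanh(z)

w_star = rng.standard_normal(d) / np.sqrt(d)   # teacher direction
w = rng.standard_normal(d) / np.sqrt(d)        # student initialization

for t in range(50_000):
    x = rng.standard_normal(d)       # one fresh sample per step: streaming
    y = sigma(x @ w_star)            # noiseless teacher label
    pred = sigma(x @ w)
    # squared-loss gradient; (1 - pred**2) is the derivative of tanh
    w -= eta * (pred - y) * (1 - pred**2) * x

cos = (w @ w_star) / (np.linalg.norm(w) * np.linalg.norm(w_star))
print(f"cosine overlap with the teacher: {cos:.3f}")
```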
The cost of hyperparameter tuning in deep learning has been rising with model size, prompting practitioners to develop new tuning methods that use smaller networks as a proxy. One …
Reinforcement learning has been successful across several applications in which agents have to learn to act in environments with sparse feedback. However, despite this empirical …
The use of mini-batches of data in training artificial neural networks is now commonplace. Despite its widespread use, theories that quantitatively explain how large or small the …
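To make the role of the batch size concrete, here is a toy sketch of mini-batch SGD on least-squares regression with an explicit batch-size parameter B; the data model and all hyperparameters are illustrative assumptions.

```python
import numpy as np

# Toy mini-batch SGD on least squares, exposing the batch size B whose
# effect on the dynamics the snippet's theory aims to quantify.

rng = np.random.default_rng(1)
n, d, B, eta = 2048, 50, 32, 0.05     # B is the mini-batch size (assumed)
X = rng.standard_normal((n, d))
w_true = rng.standard_normal(d)
y = X @ w_true + 0.1 * rng.standard_normal(n)

w = np.zeros(d)
for epoch in range(50):
    perm = rng.permutation(n)          # reshuffle once per epoch
    for start in range(0, n, B):
        idx = perm[start:start + B]
        grad = X[idx].T @ (X[idx] @ w - y[idx]) / len(idx)
        w -= eta * grad

print("train MSE:", np.mean((X @ w - y) ** 2))
```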
H Cui - arXiv preprint arXiv:2409.13904, 2024 - arxiv.org
Recent years have been marked by the fast-paced diversification and increasing ubiquity of machine learning applications. Yet, a firm theoretical understanding of the surprising …
We study the dynamics in high dimensions of online stochastic gradient descent for the multi-spiked tensor model. This multi-index model arises from the tensor principal component …
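For intuition, a minimal sketch of online SGD on a single-spiked order-3 tensor, the simplest instance of the multi-spiked model, could look like the following; a fresh noisy tensor is drawn at every step (the online data model), and the signal strength, dimension, and step size are illustrative assumptions.

```python
import numpy as np

# Online SGD (gradient ascent on the sphere) for single-spiked order-3
# tensor PCA. lam, d, and eta are assumed values for illustration.

rng = np.random.default_rng(2)
d, lam, eta = 20, 10.0, 0.1 / d

v = rng.standard_normal(d); v /= np.linalg.norm(v)   # planted spike
u = rng.standard_normal(d); u /= np.linalg.norm(u)   # estimate

signal = lam * np.einsum("i,j,k->ijk", v, v, v)
for t in range(10_000):
    Y = signal + rng.standard_normal((d, d, d))      # fresh noisy sample
    # exact gradient of <Y, u (x) u (x) u> for non-symmetric Y
    grad = (np.einsum("ijk,j,k->i", Y, u, u)
            + np.einsum("jik,j,k->i", Y, u, u)
            + np.einsum("jki,j,k->i", Y, u, u))
    u += eta * grad                                  # ascent step
    u /= np.linalg.norm(u)                           # retract to sphere

print(f"overlap <u, v> = {u @ v:.3f}")
```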
We develop a solvable model of neural scaling laws beyond the kernel limit. Theoretical analysis of this model shows how performance scales with model size, training time, and the …
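As a toy illustration of how such scaling predictions are checked, one can recover a power-law exponent from loss measurements by a log-log fit; the measurements below are synthetic and every constant is an assumption for demonstration only.

```python
import numpy as np

# Recovering a scaling exponent from synthetic losses generated by an
# assumed power law L(N) = A * N**-a at fixed training time.

rng = np.random.default_rng(4)
N = np.logspace(5, 9, 9)                   # model sizes (assumed grid)
loss = 5.0 * N**-0.3 * (1 + 0.02 * rng.standard_normal(N.size))

# slope of log-loss vs log-N gives -a; intercept gives log A
coef = np.polyfit(np.log(N), np.log(loss), 1)
print(f"fitted exponent a = {-coef[0]:.3f} (planted: 0.3), "
      f"A = {np.exp(coef[1]):.2f} (planted: 5.0)")
```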
We investigate the test risk of continuous-time stochastic gradient flow dynamics in learning theory. Using a path-integral formulation, we provide, in the regime of small learning rate, a …
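A small sketch of the noiseless gradient-flow limit on over-parameterized least squares, integrated with a tiny Euler step as a stand-in for the small-learning-rate regime; the data model and step size are assumptions, and the stochastic (SGD-noise) correction captured by the path-integral treatment is omitted here.

```python
import numpy as np

# Gradient flow dw/dt = -grad L(w) on ridgeless linear regression,
# tracked via its test risk over time. All parameters are assumed.

rng = np.random.default_rng(3)
n, d, dt = 200, 400, 0.05                 # over-parameterized: d > n
X = rng.standard_normal((n, d)) / np.sqrt(d)
w_true = rng.standard_normal(d)
y = X @ w_true + 0.1 * rng.standard_normal(n)

X_te = rng.standard_normal((1000, d)) / np.sqrt(d)  # held-out test set
y_te = X_te @ w_true

w = np.zeros(d)
for step in range(1, 2001):
    w -= dt * X.T @ (X @ w - y)           # Euler step of the flow
    if step % 400 == 0:
        risk = np.mean((X_te @ w - y_te) ** 2)
        print(f"t = {step * dt:6.1f}   test risk = {risk:.4f}")
```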
We study the dynamics of two local optimization algorithms, online stochastic gradient descent (SGD) and gradient flow, within the framework of the multi-spiked tensor model in …