F Chen, D Kunin, A Yamamura… - Advances in Neural …, 2024 - proceedings.neurips.cc
In this work, we reveal a strong implicit bias of stochastic gradient descent (SGD) that drives
overly expressive networks to much simpler subnetworks, thereby dramatically reducing the …