Z Li, T Wang, D Yu - Advances in Neural Information …, 2022 - proceedings.neurips.cc
Abstract: We prove the Fast Equilibrium Conjecture proposed by Li et al. (2020), i.e.,
stochastic gradient descent (SGD) on a scale-invariant loss (e.g., using networks with various …