Taylorized training: Towards better approximation of neural network training at finite width

Y Bai, B Krause, H Wang, C Xiong, R Socher - arXiv preprint arXiv …, 2020 - arxiv.org
We propose Taylorized training as an initiative towards better understanding neural
network training at finite width. Taylorized training involves training the $k$-th order Taylor …

Taylorized Training: Towards Better Approximation of Neural Network Training at Finite Width

Y Bai, B Krause, H Wang, C Xiong, R Socher - arXiv e-prints, 2020 - ui.adsabs.harvard.edu
We propose Taylorized training as an initiative towards better understanding neural
network training at finite width. Taylorized training involves training the $k$-th order Taylor …