Beyond finite layer neural networks: Bridging deep architectures and numerical differential equations

Authors

Yiping Lu, Aoxiao Zhong, Quanzheng Li, Bin Dong

Publication date

2018/7/3

Conference

International Conference on Machine Learning

Pages

3276-3285

Publisher

PMLR

Description

Deep neural networks have become the state-of-the-art models in numerous machine learning tasks. However, general guidance for network architecture design is still missing. In our work, we bridge deep neural network design with numerical differential equations. We show that many effective networks, such as ResNet, PolyNet, FractalNet and RevNet, can be interpreted as different numerical discretizations of differential equations. This finding brings us a brand new perspective on the design of effective deep architectures. We can take advantage of the rich knowledge in numerical analysis to guide us in designing new and potentially more effective deep networks. As an example, we propose a linear multi-step architecture (LM-architecture), which is inspired by the linear multi-step method for solving ordinary differential equations. The LM-architecture is an effective structure that can be used on any ResNet-like network. In particular, we demonstrate that LM-ResNet and LM-ResNeXt (i.e., the networks obtained by applying the LM-architecture to ResNet and ResNeXt respectively) can achieve noticeably higher accuracy than ResNet and ResNeXt on both CIFAR and ImageNet with comparable numbers of trainable parameters. Moreover, on both CIFAR and ImageNet, LM-ResNet/LM-ResNeXt can significantly compress (> 50%) the original networks while maintaining a similar performance. This can be explained mathematically using the concept of modified equation from numerical analysis. Last but not least, we also establish a connection between stochastic control and noise injection in the training process which helps to improve …
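The discretization reading in the description can be made concrete with a small sketch. The abstract does not spell out the update rules, so the code below rests on assumptions: a residual block is read as one forward Euler step, x_{n+1} = x_n + f(x_n), and the LM-architecture is taken to be the two-step update x_{n+1} = (1 - k_n) x_n + k_n x_{n-1} + f(x_n), which is how the rule is usually stated for this paper. The scalar state, the toy branch `residual_branch`, and the fixed values in `ks` are hypothetical stand-ins for illustration; in a real network f would be a convolutional block and each k_n would presumably be learned.

```python
import math

def residual_branch(x, w):
    # Hypothetical toy residual branch f(x) = tanh(w * x); stands in for a conv block.
    return math.tanh(w * x)

def resnet_forward(x0, weights, step=1.0):
    # Forward Euler reading of a residual stack: x_{n+1} = x_n + step * f(x_n).
    x = x0
    for w in weights:
        x = x + step * residual_branch(x, w)
    return x

def lm_forward(x0, weights, ks):
    # Assumed two-step update: x_{n+1} = (1 - k_n) * x_n + k_n * x_{n-1} + f(x_n).
    # The first layer is a plain residual step because no x_{n-1} exists yet.
    x_prev, x = x0, x0 + residual_branch(x0, weights[0])
    for w, k in zip(weights[1:], ks):
        x_prev, x = x, (1.0 - k) * x + k * x_prev + residual_branch(x, w)
    return x

# Illustrative values only: four layers, scalar state.
weights = [0.5, -0.3, 0.8, 0.2]
x0 = 1.0
out_res = resnet_forward(x0, weights)
out_lm_k0 = lm_forward(x0, weights, ks=[0.0, 0.0, 0.0])
out_lm = lm_forward(x0, weights, ks=[0.3, 0.3, 0.3])
```

Under these assumptions, setting every k_n to zero reduces the two-step update to the plain residual stack, so the LM form can be seen as a ResNet with one extra scalar per layer.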

Total citations

Cited by 570

Citations per year: 2018: 14, 2019: 65, 2020: 101, 2021: 110, 2022: 123, 2023: 104, 2024: 51

Scholar articles

Beyond finite layer neural networks: Bridging deep architectures and numerical differential equations

Y Lu, A Zhong, Q Li, B Dong - International Conference on Machine Learning, 2018
