Recent advances in deep learning theory

F He, D Tao - arXiv preprint arXiv:2012.10931, 2020 - arxiv.org
Deep learning is usually described as an experiment-driven field under continuous criticism
for lacking theoretical foundations. This problem has been partially addressed by a large volume of …

[BOOK][B] The principles of deep learning theory

DA Roberts, S Yaida, B Hanin - 2022 - cambridge.org
This textbook establishes a theoretical framework for understanding deep learning models
of practical relevance. With an approach that borrows from theoretical physics, Roberts and …

A probabilistic theory of deep learning

AB Patel, T Nguyen, RG Baraniuk - arXiv preprint arXiv:1504.00641, 2015 - arxiv.org
A grand challenge in machine learning is the development of computational algorithms that
match or outperform humans in perceptual inference tasks that are complicated by nuisance …

[BOOK][B] Deep learning: Foundations and concepts

CM Bishop, H Bishop - 2023 - books.google.com
This book offers a comprehensive introduction to the central ideas that underpin deep
learning. It is intended both for newcomers to machine learning and for those already …

The unbearable shallow understanding of deep learning

A Plebe, G Grasso - Minds and Machines, 2019 - Springer
This paper analyzes the rapid and unexpected rise of deep learning within Artificial
Intelligence and its applications. It tackles the possible reasons for this remarkable success …

Stochastic deep networks

G De Bie, G Peyré, M Cuturi - International Conference on …, 2019 - proceedings.mlr.press
Machine learning is increasingly targeting areas where input data cannot be
accurately described by a single vector, but can instead be modeled using the more flexible …

Reconciling modern deep learning with traditional optimization analyses: The intrinsic learning rate

Z Li, K Lyu, S Arora - Advances in Neural Information …, 2020 - proceedings.neurips.cc
Recent works (e.g., Li & Arora, 2020) suggest that the use of popular normalization
schemes (including Batch Normalization) in today's deep learning can move it far from a …

A diffusion theory for deep learning dynamics: Stochastic gradient descent exponentially favors flat minima

Z Xie, I Sato, M Sugiyama - arXiv preprint arXiv:2002.03495, 2020 - arxiv.org
Stochastic Gradient Descent (SGD) and its variants are mainstream methods for training
deep networks in practice. SGD is known to find a flat minimum that often generalizes well …

Rethinking generalization requires revisiting old ideas: statistical mechanics approaches and complex learning behavior

CH Martin, MW Mahoney - arXiv preprint arXiv:1710.09553, 2017 - arxiv.org
We describe an approach to understand the peculiar and counterintuitive generalization
properties of deep neural networks. The approach involves going beyond worst-case …

On neural differential equations

P Kidger - arXiv preprint arXiv:2202.02435, 2022 - arxiv.org
The conjoining of dynamical systems and deep learning has become a topic of great
interest. In particular, neural differential equations (NDEs) demonstrate that neural networks …