DESCINet: A hierarchical deep convolutional neural network with skip connection for long time series forecasting

AQB Silva, WN Gonçalves, ET Matsubara - Expert Systems with …, 2023 - Elsevier
Time series forecasting is the process of predicting future values of a time series from
knowledge of its past data. Although there are several models for making short-term …
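
The snippet's framing of forecasting as predicting future values from past data can be made concrete with a sliding-window setup. The sketch below is a generic illustration, not the DESCINet pipeline; window lengths are arbitrary.

```python
# Minimal sketch (not the cited model): frame a univariate series as a
# supervised problem by pairing each past window with the values to predict.
import numpy as np

def make_windows(series, lookback, horizon):
    """Split a 1-D series into (past window, future target) pairs."""
    X, y = [], []
    for t in range(len(series) - lookback - horizon + 1):
        X.append(series[t : t + lookback])                        # past values
        y.append(series[t + lookback : t + lookback + horizon])   # future values
    return np.array(X), np.array(y)

series = np.sin(np.linspace(0, 20, 500))          # toy series
X, y = make_windows(series, lookback=48, horizon=12)
print(X.shape, y.shape)                           # (441, 48) (441, 12)
```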

[HTML] High-dimensional dynamics of generalization error in neural networks

MS Advani, AM Saxe, H Sompolinsky - Neural Networks, 2020 - Elsevier
We perform an analysis of the average generalization dynamics of large neural networks
trained using gradient descent. We study the practically-relevant “high-dimensional” regime …

Skip connections eliminate singularities

AE Orhan, X Pitkow - arXiv preprint arXiv:1701.09175, 2017 - arxiv.org
Skip connections made the training of very deep networks possible and have become an
indispensable component in a variety of neural architectures. A completely satisfactory …
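
For reference, the skip-connection idea the snippet refers to is an identity path added around a nonlinear transformation. A minimal PyTorch sketch follows; the layer sizes are illustrative and do not come from the cited work.

```python
# Minimal sketch of a residual (skip-connection) block: the identity path
# lets activations and gradients bypass the nonlinear body.
import torch
import torch.nn as nn

class ResidualBlock(nn.Module):
    def __init__(self, channels):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(channels, channels, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.Conv2d(channels, channels, kernel_size=3, padding=1),
        )

    def forward(self, x):
        return torch.relu(x + self.body(x))   # skip connection: add the input back

x = torch.randn(1, 16, 32, 32)
print(ResidualBlock(16)(x).shape)             # torch.Size([1, 16, 32, 32])
```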

[BOOK][B] Information geometry and its applications

S Amari - 2016 - books.google.com
This is the first comprehensive book on information geometry, written by the founder of the
field. It begins with an elementary introduction to dualistic geometry and proceeds to a wide …

Active learning of dynamics for data-driven control using Koopman operators

I Abraham, TD Murphey - IEEE Transactions on Robotics, 2019 - ieeexplore.ieee.org
This paper presents an active learning strategy for robotic systems that takes into account
task information, enables fast learning, and allows control to be readily synthesized by …
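
As background for the Koopman-based learning the snippet mentions, a linear operator can be fit to snapshot data by least squares (a plain DMD-style fit). This sketch covers only that non-active ingredient, on toy data; the cited work additionally chooses where to collect data and synthesizes control.

```python
# Minimal sketch: fit a linear operator K so that x_{t+1} ≈ K x_t from
# snapshot pairs (DMD on state observables), using toy linear dynamics.
import numpy as np

rng = np.random.default_rng(0)
A_true = np.array([[0.95, 0.10], [-0.10, 0.95]])   # toy dynamics to recover
X = [rng.standard_normal(2)]
for _ in range(200):
    X.append(A_true @ X[-1])
X = np.array(X).T                                  # snapshots, shape (2, 201)

X_now, X_next = X[:, :-1], X[:, 1:]
K = X_next @ np.linalg.pinv(X_now)                 # least-squares fit
print(np.round(K, 3))                              # ≈ A_true
```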

Micro-batch training with batch-channel normalization and weight standardization

S Qiao, H Wang, C Liu, W Shen, A Yuille - arXiv preprint arXiv:1903.10520, 2019 - arxiv.org
Batch Normalization (BN) has become an out-of-box technique to improve deep network
training. However, its effectiveness is limited for micro-batch training, i.e., each GPU typically …
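
The Weight Standardization part of the title can be summarized as standardizing each output filter's weights before the convolution. The sketch below follows that idea; the epsilon and shapes are illustrative rather than taken from the paper's released code.

```python
# Minimal sketch of Weight Standardization: zero-mean, unit-variance weights
# per output filter, applied on the fly before the convolution.
import torch
import torch.nn as nn
import torch.nn.functional as F

class WSConv2d(nn.Conv2d):
    def forward(self, x):
        w = self.weight
        mean = w.mean(dim=(1, 2, 3), keepdim=True)
        std = w.std(dim=(1, 2, 3), keepdim=True) + 1e-5
        return F.conv2d(x, (w - mean) / std, self.bias,
                        self.stride, self.padding, self.dilation, self.groups)

x = torch.randn(2, 3, 32, 32)
print(WSConv2d(3, 8, kernel_size=3, padding=1)(x).shape)   # torch.Size([2, 8, 32, 32])
```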

Stochastic collapse: How gradient noise attracts SGD dynamics towards simpler subnetworks

F Chen, D Kunin, A Yamamura… - Advances in Neural …, 2024 - proceedings.neurips.cc
In this work, we reveal a strong implicit bias of stochastic gradient descent (SGD) that drives
overly expressive networks to much simpler subnetworks, thereby dramatically reducing the …

[BOOK][B] Algebraic geometry and statistical learning theory

S Watanabe - 2009 - books.google.com
Sure to be influential, Watanabe's book lays the foundations for the use of algebraic
geometry in statistical learning theory. Many models/machines are singular: mixture models …

Learning time-scales in two-layers neural networks

R Berthier, A Montanari, K Zhou - Foundations of Computational …, 2024 - Springer
Gradient-based learning in multi-layer neural networks displays a number of striking
features. In particular, the decrease rate of empirical risk is non-monotone even after …

Classification of malignant tumors in breast ultrasound using a pretrained deep residual network model and support vector machine

WC Shia, DR Chen - Computerized Medical Imaging and Graphics, 2021 - Elsevier
In this study, a transfer learning method was utilized to recognize and classify benign and
malignant breast tumors, using two-dimensional breast ultrasound (US) images, to decrease …
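
The generic pipeline the snippet describes, a pretrained residual network as a fixed feature extractor followed by an SVM, can be sketched as below. The backbone (ResNet-18) and the random placeholder data are assumptions for illustration, not the study's actual network or ultrasound dataset; loading the ImageNet weights requires a download.

```python
# Minimal sketch: extract features with a pretrained ResNet, then train an SVM.
import torch
import torchvision.models as models
from sklearn.svm import SVC

resnet = models.resnet18(weights="IMAGENET1K_V1")
resnet.fc = torch.nn.Identity()          # drop the classifier head, keep 512-d features
resnet.eval()

images = torch.randn(20, 3, 224, 224)    # placeholder batch (stand-in for US images)
labels = [0, 1] * 10                     # placeholder benign/malignant labels
with torch.no_grad():
    feats = resnet(images).numpy()       # (20, 512) feature matrix

clf = SVC(kernel="rbf").fit(feats, labels)
print(clf.predict(feats[:4]))
```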