Despite the non-convex optimization landscape, over-parametrized shallow networks are able to achieve global convergence under gradient descent. The picture can be radically …
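To make the convergence claim concrete, here is a minimal NumPy sketch: full-batch gradient descent on a two-layer ReLU network whose width far exceeds the number of samples drives the training loss toward zero even for arbitrary targets. The width, learning rate, and step count are illustrative assumptions, not values from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d, width = 20, 5, 2000        # few samples, width >> n: over-parametrized
X = rng.standard_normal((n, d))
y = rng.standard_normal(n)       # arbitrary targets

a = rng.choice([-1.0, 1.0], width) / np.sqrt(width)  # fixed second layer
W = rng.standard_normal((width, d))                  # trained first layer
lr = 0.5                                             # illustrative step size

for step in range(5001):
    pre = X @ W.T                            # (n, width) pre-activations
    err = np.maximum(pre, 0.0) @ a - y       # network output minus target
    # gradient of the loss 0.5 * mean(err**2) with respect to W
    grad = ((err[:, None] * (pre > 0)) * a[None, :]).T @ X / n
    W -= lr * grad
    if step % 1000 == 0:
        print(step, 0.5 * np.mean(err ** 2))  # loss decays toward zero
```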
The problem of on-line learning in two-layer neural networks is studied within the framework of statistical mechanics. A fully connected committee machine with K hidden units is trained …
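For orientation (background standard in this line of work, not quoted from the entry itself): with student weights J_i, teacher weights B_n, and input dimension N, the high-dimensional dynamics closes on a few overlap order parameters, and for the activation g(x) = erf(x/sqrt(2)) the generalization error takes the closed form

```latex
% Order parameters:
%   Q_{ik} = J_i \cdot J_k / N,\quad R_{in} = J_i \cdot B_n / N,\quad T_{nm} = B_n \cdot B_m / N
\epsilon_g = \frac{1}{\pi}\Bigg[
    \sum_{i,k} \arcsin\frac{Q_{ik}}{\sqrt{(1+Q_{ii})(1+Q_{kk})}}
  + \sum_{n,m} \arcsin\frac{T_{nm}}{\sqrt{(1+T_{nn})(1+T_{mm})}}
  - 2\sum_{i,n} \arcsin\frac{R_{in}}{\sqrt{(1+Q_{ii})(1+T_{nn})}}
\Bigg]
```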
M Opper, O Winther - On-line learning in neural networks, 1999 - research.aston.ac.uk
Online learning is discussed from the viewpoint of Bayesian statistical inference. By replacing the true posterior distribution with a simpler parametric distribution, one can define …
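A concrete instance of this idea is assumed-density filtering: keep a Gaussian approximation to the posterior over the weights and, after each example, project the exact (non-Gaussian) Bayesian update back onto the Gaussian family by moment matching. The probit-perceptron sketch below is one standard construction of this type, offered as an illustration rather than as the authors' exact scheme.

```python
import numpy as np
from scipy.stats import norm

def adf_probit_update(mu, Sigma, x, y):
    """One assumed-density-filtering step for the likelihood Phi(y * w.x):
    the exact posterior N(w; mu, Sigma) * Phi(y * w.x) is replaced by the
    Gaussian matching its first two moments (standard probit moment formulas)."""
    v = x @ Sigma @ x + 1.0
    z = y * (mu @ x) / np.sqrt(v)
    r = norm.pdf(z) / norm.cdf(z)               # hazard ratio of the probit
    Sx = Sigma @ x
    mu_new = mu + (y * r / np.sqrt(v)) * Sx     # shifted mean
    Sigma_new = Sigma - (r * (z + r) / v) * np.outer(Sx, Sx)  # shrunk covariance
    return mu_new, Sigma_new

# Toy stream: labels from a fixed teacher perceptron.
rng = np.random.default_rng(0)
N = 20
w_star = rng.standard_normal(N)
mu, Sigma = np.zeros(N), np.eye(N)
for _ in range(500):
    x = rng.standard_normal(N)
    mu, Sigma = adf_probit_update(mu, Sigma, x, np.sign(w_star @ x))
print("teacher overlap:", mu @ w_star / (np.linalg.norm(mu) * np.linalg.norm(w_star)))
```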
This manuscript investigates the one-pass stochastic gradient descent (SGD) dynamics of a two-layer neural network trained on Gaussian data and labels generated by a similar …
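A minimal simulation of the setup described here, assuming the teacher-student soft committee machine with erf activation common in this literature; the sizes and learning rate are illustrative. The matrices R and Q computed along the way are the order parameters appearing in the closed-form error quoted earlier.

```python
import numpy as np
from scipy.special import erf

rng = np.random.default_rng(0)
N, K, M = 500, 3, 3              # input dim, student and teacher hidden units
eta, steps = 0.5, 200_000        # learning rate and number of one-pass steps

g  = lambda h: erf(h / np.sqrt(2))                       # hidden activation
dg = lambda h: np.sqrt(2 / np.pi) * np.exp(-h ** 2 / 2)  # its derivative

B = rng.standard_normal((M, N))          # fixed teacher
J = rng.standard_normal((K, N)) * 0.01   # student, small random init

def output(W, X):
    """Committee output: sum of erf units applied to the local fields."""
    return g(X @ W.T / np.sqrt(N)).sum(axis=-1)

X_test = rng.standard_normal((2000, N))
y_test = output(B, X_test)

for t in range(steps):
    x = rng.standard_normal(N)                  # fresh example: one-pass SGD
    h = J @ x / np.sqrt(N)                      # student local fields
    e = output(B, x[None])[0] - g(h).sum()      # label minus prediction
    # 1/N rate scaling follows the convention of this literature (time alpha = t/N)
    J += (eta / N) * e * dg(h)[:, None] * x[None, :]
    if t % 20_000 == 0:
        R, Q = J @ B.T / N, J @ J.T / N         # overlap order parameters
        eps_g = 0.5 * np.mean((output(J, X_test) - y_test) ** 2)
        print(f"alpha={t / N:6.1f}  eps_g={eps_g:.4f}")
```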
One may argue that the simplest type of neural networks beyond a single perceptron is an array of several perceptrons in parallel. In spite of their simplicity, such circuits can compute …
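Concretely, such a circuit (a committee machine) takes a majority vote over K perceptrons running in parallel; already for K = 3 it computes rules no single perceptron can, e.g. XOR. The weights below are a hand-picked illustration (the last component is a bias weight on a constant +1 input).

```python
import numpy as np

def committee(W, x):
    """Hard committee machine: majority vote over K perceptrons in parallel."""
    return np.sign(np.sign(W @ x).sum())

# Three perceptrons realizing XOR, which is not linearly separable.
W = np.array([[ 1.0, -1.0, -1.0],
              [-1.0,  1.0, -1.0],
              [ 0.0,  0.0,  1.0]])
for x1, x2 in [(1, 1), (1, -1), (-1, 1), (-1, -1)]:
    x = np.array([x1, x2, 1.0])
    print((x1, x2), "->", committee(W, x))   # +1 iff x1 != x2
```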
We describe an approach to understand the peculiar and counterintuitive generalization properties of deep neural networks. The approach involves going beyond worst-case …
P Riegler, M Biehl - Journal of Physics A: Mathematical and General, 1995 - iopscience.iop.org
We present an exact analysis of learning a rule by on-line gradient descent in a two-layered neural network with adjustable hidden-to-output weights (backpropagation of error). Results …
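In sketch form, the setting analyzed here adds gradient updates for the hidden-to-output weights v on top of the first-layer updates; everything else follows the teacher-student conventions of the earlier sketch. Sizes, rates, and the 1/N rate scaling (chosen so both layers evolve on the time scale alpha = t/N) are assumptions for illustration.

```python
import numpy as np
from scipy.special import erf

rng = np.random.default_rng(1)
N, K = 500, 3                    # input dimension, hidden units (illustrative)
eta_J, eta_v = 0.5, 0.1          # per-layer learning rates (illustrative)

g  = lambda h: erf(h / np.sqrt(2))
dg = lambda h: np.sqrt(2 / np.pi) * np.exp(-h ** 2 / 2)

B, u = rng.standard_normal((K, N)), np.ones(K)   # fixed teacher, both layers
J = rng.standard_normal((K, N)) * 0.01           # student first layer
v = rng.standard_normal(K)                       # student second layer

for t in range(200_000):
    x = rng.standard_normal(N)                   # fresh example each step
    h = J @ x / np.sqrt(N)
    e = u @ g(B @ x / np.sqrt(N)) - v @ g(h)     # label minus prediction
    v += (eta_v / N) * e * g(h)                  # hidden-to-output update
    J += (eta_J / N) * e * (v * dg(h))[:, None] * x[None, :]  # backpropagated
    if t % 40_000 == 0:
        print(t, 0.5 * e ** 2)   # noisy one-example estimate of the error
```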
M Biehl, P Riegler, C Wöhler - Journal of Physics A: Mathematical and General, 1996 - iopscience.iop.org
The dynamics of on-line learning in neural networks with continuous units is dominated by plateaux in the time dependence of the generalization error. Using tools from statistical …
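The plateau reflects a symmetric phase in which each student unit has nearly the same overlap with every teacher unit; escaping it means specialization. A crude diagnostic that could be dropped into the simulation sketches above (the summary statistic is my own choice, not from the paper):

```python
import numpy as np

def specialization(J, B):
    """Spread of the student-teacher overlaps R = J B^T / N across student units.
    Near zero during the symmetric plateau (all rows of R alike); grows once
    each student unit locks onto a distinct teacher unit."""
    R = J @ B.T / J.shape[1]
    return R.std(axis=0).mean()
```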
I propose a general model of on-line learning from random examples which, when applied to a smooth realizable stochastic rule, yields the same asymptotic generalization error rate …