Smart healthcare disease diagnosis and patient management: Innovation, improvement and skill development

A Ray, AK Chaudhuri - Machine Learning with Applications, 2021 - Elsevier
Data mining (DM) is a tool for detecting patterns in, and extracting knowledge from, large
quantities of data. Many robust early detection services and other health-related …

Stochastic gradient descent as approximate Bayesian inference

S Mandt, MD Hoffman, DM Blei - Journal of Machine Learning …, 2017 - jmlr.org
Stochastic Gradient Descent with a constant learning rate (constant SGD) simulates a
Markov chain with a stationary distribution. With this perspective, we derive several new …
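
To make the snippet's viewpoint concrete, here is a minimal sketch (my own illustration, not the authors' code): constant-step-size SGD on a one-dimensional quadratic is a time-homogeneous Markov chain whose iterates settle into a stationary distribution around the minimizer, with spread governed by the learning rate.

```python
import numpy as np

rng = np.random.default_rng(0)

# Noisy gradient of f(x) = 0.5 * x^2, mimicking mini-batch gradient noise.
def noisy_grad(x):
    return x + rng.normal()

eta = 0.1                      # constant learning rate
x, burn_in, n_samples = 5.0, 1_000, 20_000
samples = []
for t in range(burn_in + n_samples):
    x -= eta * noisy_grad(x)   # a time-homogeneous Markov chain in x
    if t >= burn_in:
        samples.append(x)

# For this linear chain the stationary variance is eta / (2 - eta)
# times the gradient-noise variance (here 1).
print(np.var(samples), eta / (2 - eta))
```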

New insights and perspectives on the natural gradient method

J Martens - Journal of Machine Learning Research, 2020 - jmlr.org
Natural gradient descent is an optimization method traditionally motivated from the
perspective of information geometry, and works well for many applications as an alternative …
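
As a rough illustration of the method named in the title (a sketch on an assumed toy model, not code from the paper): natural gradient descent preconditions the gradient by the inverse Fisher information matrix, updating w <- w - eta * F^{-1} grad L. For logistic regression the Fisher matrix has a closed form, which keeps the sketch short; all variable names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy logistic regression data.
X = rng.normal(size=(200, 3))
w_true = np.array([1.0, -2.0, 0.5])
y = (rng.random(200) < 1 / (1 + np.exp(-X @ w_true))).astype(float)

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

w = np.zeros(3)
eta, damping = 1.0, 1e-3
for _ in range(50):
    p = sigmoid(X @ w)
    grad = X.T @ (p - y) / len(y)
    # Fisher information for logistic regression:
    # F = (1/n) * X^T diag(p * (1 - p)) X (it coincides with the Hessian here).
    F = X.T @ (X * (p * (1 - p))[:, None]) / len(y)
    # Natural gradient step: precondition by the (damped) inverse Fisher.
    w -= eta * np.linalg.solve(F + damping * np.eye(3), grad)

print(w)
```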

The step decay schedule: A near optimal, geometrically decaying learning rate procedure for least squares

R Ge, SM Kakade, R Kidambi… - Advances in neural …, 2019 - proceedings.neurips.cc
Minimax optimal convergence rates for numerous classes of stochastic convex optimization
problems are well characterized, where the majority of results utilize iterate averaged …
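
For concreteness, a minimal sketch of a step-decay schedule of the kind the title refers to (parameter names and values are illustrative, not the paper's constants): the learning rate is piecewise constant and cut by a fixed factor at regular intervals, so it decays geometrically.

```python
# Step-decay schedule: piecewise-constant learning rate, halved every
# `decay_every` steps (parameter names and values are illustrative).
def step_decay_lr(t, lr0=1.0, decay_every=100, factor=0.5):
    return lr0 * factor ** (t // decay_every)

for t in [0, 99, 100, 250, 500]:
    print(t, step_decay_lr(t))
```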

Stochastic modified equations and dynamics of stochastic gradient algorithms I: Mathematical foundations

Q Li, C Tai, E Weinan - Journal of Machine Learning Research, 2019 - jmlr.org
We develop the mathematical foundations of the stochastic modified equations (SME)
framework for analyzing the dynamics of stochastic gradient algorithms, where the latter is …
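
To illustrate the SME idea in the simplest setting (an assumed toy example, not the paper's construction): SGD with learning rate eta and gradient noise can be matched, to first order, by an Euler-Maruyama discretization of the SDE dX = -grad f(X) dt + sqrt(eta) * sigma dW, taking one SDE step of size dt = eta per SGD step.

```python
import numpy as np

rng = np.random.default_rng(0)
eta, sigma, steps = 0.05, 1.0, 2000

def grad(x):                 # f(x) = 0.5 * x^2, so grad f(x) = x
    return x

# Plain SGD with additive gradient noise of standard deviation sigma.
x_sgd = 3.0
for _ in range(steps):
    x_sgd -= eta * (grad(x_sgd) + sigma * rng.normal())

# Euler-Maruyama discretization of the first-order SME
#   dX = -grad f(X) dt + sqrt(eta) * sigma dW,
# with time step dt = eta, matching one SGD step per SDE step.
x_sde, dt = 3.0, eta
for _ in range(steps):
    x_sde += -grad(x_sde) * dt + np.sqrt(eta) * sigma * np.sqrt(dt) * rng.normal()

# Both trajectories hover near the minimizer with comparable spread.
print(x_sgd, x_sde)
```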

Bridging the gap between constant step size stochastic gradient descent and Markov chains

A Dieuleveut, A Durmus, F Bach - 2020 - projecteuclid.org
The Annals of Statistics, 2020, Vol. 48, No. 3, pp. 1348–1382. https://doi.org/10.1214/19-AOS1850 …
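
One way to see the paper's theme concretely (an illustrative sketch; the toy objective and constants are my own): on a non-quadratic objective, the averaged iterates of constant-step-size SGD converge to a point with an O(eta) bias, and Richardson-Romberg extrapolation across two step sizes cancels the first-order term.

```python
import numpy as np

rng = np.random.default_rng(0)

# Non-quadratic toy objective f(x) = exp(x) - x (minimizer x* = 0),
# observed through noisy gradients; setup and constants are illustrative.
def noisy_grad(x):
    return np.exp(x) - 1.0 + rng.normal(scale=0.5)

def averaged_sgd(eta, steps=100_000, burn_in=5_000):
    x, acc, n = 0.0, 0.0, 0
    for t in range(steps):
        x -= eta * noisy_grad(x)
        if t >= burn_in:
            acc, n = acc + x, n + 1
    return acc / n

avg = averaged_sgd(0.2)        # averaged iterate, step size eta
avg_half = averaged_sgd(0.1)   # same, with step size eta / 2
# The averaged iterate carries an O(eta) bias away from x* = 0;
# Richardson-Romberg extrapolation cancels the first-order term.
print(avg, avg_half, 2 * avg_half - avg)
```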

The implicit regularization of dynamical stability in stochastic gradient descent

L Wu, WJ Su - International Conference on Machine …, 2023 - proceedings.mlr.press
In this paper, we study the implicit regularization of stochastic gradient descent (SGD)
through the lens of dynamical stability (Wu et al., 2018). We start by revising existing stability …
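
A minimal numerical illustration of the dynamical-stability lens (assumed quadratic minima; not the paper's experiments): a minimum whose sharpness lam satisfies eta * lam > 2 is linearly unstable for gradient descent with step size eta, so the iterates escape it while flatter minima are retained.

```python
# Dynamical-stability heuristic on quadratic minima f(x) = 0.5 * lam * x^2:
# gradient descent with step size eta is linearly stable at the minimum
# only if eta * lam <= 2; sharper minima are escaped.
def run_gd(lam, eta, x0=1e-3, steps=50):
    x = x0
    for _ in range(steps):
        x -= eta * lam * x
    return x

eta = 0.1
for lam in [5.0, 25.0]:   # a flat minimum vs. a sharp one
    status = "stable" if abs(run_gd(lam, eta)) <= 1e-3 else "escapes"
    print(f"sharpness {lam}: {status}")
```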

Learning theory from first principles

F Bach - 2024 - di.ens.fr
This draft textbook is extracted from lecture notes from a class that I have taught
(unfortunately online, but this gave me an opportunity to write more detailed notes) during …

Solving empirical risk minimization in the current matrix multiplication time

YT Lee, Z Song, Q Zhang - Conference on Learning Theory, 2019 - proceedings.mlr.press
Many convex problems in machine learning and computer science share the same
form: $\min_{x} \sum_{i} f_i(A_i x + b_i)$, where $f_i$ are convex …
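
To make the displayed form concrete, a short sketch (illustrative only; the paper itself develops an interior-point method running in matrix multiplication time, not gradient descent): take $f_i(z) = \frac{1}{2}\|z\|^2$ and minimize $\sum_i f_i(A_i x + b_i)$ by plain gradient descent via the chain rule.

```python
import numpy as np

rng = np.random.default_rng(0)

# ERM of the cited form: min_x sum_i f_i(A_i x + b_i), illustrated with
# f_i(z) = 0.5 * ||z||^2 (least squares). Sizes and names are illustrative.
A = [rng.normal(size=(2, 4)) for _ in range(5)]
b = [rng.normal(size=2) for _ in range(5)]

def objective(x):
    return sum(0.5 * np.sum((Ai @ x + bi) ** 2) for Ai, bi in zip(A, b))

def gradient(x):
    # grad of f_i(A_i x + b_i) is A_i^T f_i'(A_i x + b_i) by the chain rule.
    return sum(Ai.T @ (Ai @ x + bi) for Ai, bi in zip(A, b))

x = np.zeros(4)
for _ in range(1000):
    x -= 0.01 * gradient(x)

print(objective(x))
```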

Parallelizing stochastic gradient descent for least squares regression: mini-batching, averaging, and model misspecification

P Jain, SM Kakade, R Kidambi, P Netrapalli… - Journal of Machine …, 2018 - jmlr.org
This work characterizes the benefits of averaging techniques widely used in conjunction with
stochastic gradient descent (SGD). In particular, this work presents a sharp analysis of: (1) …
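
As a concrete instance of the setting in the snippet (an assumed toy least-squares problem; not the paper's exact algorithm or step sizes): mini-batch SGD combined with tail averaging, where the second half of the iterates is averaged to reduce variance around the optimum.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy least-squares problem: y = X w* + noise (sizes are illustrative).
n, d = 2000, 5
X = rng.normal(size=(n, d))
w_star = rng.normal(size=d)
y = X @ w_star + 0.1 * rng.normal(size=n)

w, eta, batch, iters = np.zeros(d), 0.05, 16, 2000
tail_sum, tail_count = np.zeros(d), 0
for t in range(iters):
    idx = rng.integers(0, n, size=batch)          # sample a mini-batch
    g = X[idx].T @ (X[idx] @ w - y[idx]) / batch  # mini-batch gradient
    w -= eta * g
    if t >= iters // 2:                           # average the tail iterates
        tail_sum += w
        tail_count += 1

w_avg = tail_sum / tail_count
# The tail-averaged iterate is typically closer to w* than the last iterate.
print(np.linalg.norm(w - w_star), np.linalg.norm(w_avg - w_star))
```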