Text classification algorithms: A survey

K Kowsari, K Jafari Meimandi, M Heidarysafa, S Mendu… - Information, 2019 - mdpi.com
In recent years, there has been an exponential growth in the number of complex documents
and texts that require a deeper understanding of machine learning methods to be able to …

Machine learning techniques for biomedical image segmentation: an overview of technical aspects and introduction to state‐of‐art applications

H Seo, M Badiei Khuzani, V Vasudevan… - Medical …, 2020 - Wiley Online Library
In recent years, significant progress has been made in developing more accurate and
efficient machine learning algorithms for segmentation of medical and natural images. In this …

Deep learning: a statistical viewpoint

PL Bartlett, A Montanari, A Rakhlin - Acta numerica, 2021 - cambridge.org
The remarkable practical success of deep learning has revealed some major surprises from
a theoretical perspective. In particular, simple gradient methods easily find near-optimal …

ROCKET: exceptionally fast and accurate time series classification using random convolutional kernels

A Dempster, F Petitjean, GI Webb - Data Mining and Knowledge Discovery, 2020 - Springer
Most methods for time series classification that attain state-of-the-art accuracy have high
computational complexity, requiring significant training time even for smaller datasets, and …
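For orientation, a minimal sketch of the random-convolutional-kernel idea named in the title, assuming 1-D series stored as NumPy arrays and using scikit-learn's RidgeClassifierCV; kernel sizes, counts, and the omission of dilation are illustrative simplifications, not the paper's exact configuration:

```python
import numpy as np
from sklearn.linear_model import RidgeClassifierCV

def random_kernel_features(X, n_kernels=100, rng=np.random.default_rng(0)):
    """Featurize 1-D series with random convolutional kernels.

    For each kernel, keep two summary statistics per series:
    the maximum response and the proportion of positive values (PPV).
    """
    n_series, length = X.shape
    feats = np.zeros((n_series, 2 * n_kernels))
    for k in range(n_kernels):
        ksize = rng.choice([7, 9, 11])               # random kernel length
        weights = rng.normal(size=ksize)
        weights -= weights.mean()                    # zero-center the kernel
        bias = rng.uniform(-1, 1)
        for i in range(n_series):
            conv = np.convolve(X[i], weights, mode="valid") + bias
            feats[i, 2 * k] = conv.max()             # max response
            feats[i, 2 * k + 1] = (conv > 0).mean()  # PPV
    return feats

# usage sketch (X_train, y_train, X_test assumed to exist):
# clf = RidgeClassifierCV(alphas=np.logspace(-3, 3, 10))
# clf.fit(random_kernel_features(X_train), y_train)
# preds = clf.predict(random_kernel_features(X_test))
```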

Do ImageNet classifiers generalize to ImageNet?

B Recht, R Roelofs, L Schmidt… - … conference on machine …, 2019 - proceedings.mlr.press
We build new test sets for the CIFAR-10 and ImageNet datasets. Both benchmarks have
been the focus of intense research for almost a decade, raising the danger of overfitting to …

On lazy training in differentiable programming

L Chizat, E Oyallon, F Bach - Advances in neural …, 2019 - proceedings.neurips.cc
In a series of recent theoretical works, it was shown that strongly over-parameterized neural
networks trained with gradient-based methods could converge exponentially fast to zero …
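The "lazy" regime the snippet alludes to is usually presented via a first-order expansion of the network around its initialization; a sketch of that linearization and the standard continuous-time convergence bound under a positive-definite tangent kernel (notation ours; rates and constants vary by setting):

```latex
% Linearization of f(x; w) around initialization w_0 (lazy/tangent-kernel regime):
f(x; w) \;\approx\; f(x; w_0) + \nabla_w f(x; w_0)^{\top}(w - w_0),
\qquad
L(t) \;\le\; e^{-2\lambda_{\min}(K)\, t}\, L(0),
% where K is the Gram matrix of the tangent kernel on the training set
% and L is the squared loss under gradient flow.
```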

What can a single attention layer learn? A study through the random features lens

H Fu, T Guo, Y Bai, S Mei - Advances in Neural Information …, 2024 - proceedings.neurips.cc
Attention layers---which map a sequence of inputs to a sequence of outputs---are core
building blocks of the Transformer architecture which has achieved significant …
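For reference, a minimal NumPy sketch of the single attention layer such a study considers; the shapes and scaling follow the standard scaled-dot-product form, not necessarily the paper's exact parameterization:

```python
import numpy as np

def softmax(z, axis=-1):
    z = z - z.max(axis=axis, keepdims=True)    # numerical stability
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def attention_layer(X, Wq, Wk, Wv):
    """Map a sequence X (n_tokens x d) to a sequence of the same length.

    Scaled dot-product attention: each output token is a convex
    combination of value vectors, weighted by query-key similarity.
    """
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    scores = Q @ K.T / np.sqrt(K.shape[1])     # (n_tokens x n_tokens)
    return softmax(scores, axis=-1) @ V        # (n_tokens x d_v)

# usage sketch with random weights
rng = np.random.default_rng(0)
X = rng.normal(size=(5, 8))                    # 5 tokens, dimension 8
Wq, Wk, Wv = (rng.normal(size=(8, 8)) for _ in range(3))
Y = attention_layer(X, Wq, Wk, Wv)             # 5 x 8 output sequence
```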

A theoretical analysis of deep Q-learning

J Fan, Z Wang, Y Xie, Z Yang - Learning for dynamics and …, 2020 - proceedings.mlr.press
Despite the great empirical success of deep reinforcement learning, its theoretical
foundation is less well understood. In this work, we make the first attempt to theoretically …
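The object of such an analysis is the standard Q-learning / fitted-Q target; as a reminder, in standard notation rather than the paper's:

```latex
% One-step Bellman target and parameter update used in (deep) Q-learning,
% with target-network parameters \theta^- and learning rate \eta:
y_t \;=\; r_t + \gamma \max_{a'} Q_{\theta^-}(s_{t+1}, a'),
\qquad
\theta \;\leftarrow\; \theta - \eta \,\nabla_\theta \bigl(Q_\theta(s_t, a_t) - y_t\bigr)^{2}
```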

Randomized numerical linear algebra: Foundations and algorithms

PG Martinsson, JA Tropp - Acta Numerica, 2020 - cambridge.org
This survey describes probabilistic algorithms for linear algebraic computations, such as
factorizing matrices and solving linear systems. It focuses on techniques that have a proven …
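As one concrete instance of the probabilistic algorithms the abstract mentions, a minimal randomized SVD in NumPy, in the spirit of a randomized range finder; the oversampling amount and test sizes are illustrative:

```python
import numpy as np

def randomized_svd(A, rank, n_oversamples=10, rng=np.random.default_rng(0)):
    """Approximate truncated SVD via a randomized range finder.

    1. Sketch the range of A with a Gaussian test matrix.
    2. Orthonormalize the sketch to get a basis Q.
    3. Take the SVD of the small projected matrix B = Q^T A.
    """
    m, n = A.shape
    Omega = rng.normal(size=(n, rank + n_oversamples))  # Gaussian test matrix
    Q, _ = np.linalg.qr(A @ Omega)                      # orthonormal range basis
    B = Q.T @ A                                         # small (k+p) x n matrix
    Ub, s, Vt = np.linalg.svd(B, full_matrices=False)
    return (Q @ Ub)[:, :rank], s[:rank], Vt[:rank]

# usage sketch on an exactly low-rank matrix
rng = np.random.default_rng(1)
A = rng.normal(size=(500, 20)) @ rng.normal(size=(20, 300))
U, s, Vt = randomized_svd(A, rank=20)
print(np.allclose(A, (U * s) @ Vt, atol=1e-6))          # near-exact for rank-20 A
```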

Explaining neural scaling laws

Y Bahri, E Dyer, J Kaplan, J Lee, U Sharma - Proceedings of the National …, 2024 - pnas.org
The population loss of trained deep neural networks often follows precise power-law scaling
relations with either the size of the training dataset or the number of parameters in the …
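The power-law scaling relations referred to above have the generic form below; a sketch with notation ours rather than the paper's:

```latex
% Generic scaling-law ansatz: test loss vs. dataset size N or parameter count P,
% with exponents \alpha, \beta and an irreducible loss L_\infty.
L(N) \;\approx\; L_{\infty} + a\, N^{-\alpha},
\qquad
L(P) \;\approx\; L_{\infty} + b\, P^{-\beta}
```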