Weighted sums of random kitchen sinks: Replacing minimization with randomization in learning

K Kowsari, K Jafari Meimandi, M Heidarysafa, S Mendu… - Information, 2019 - mdpi.com

In recent years, there has been an exponential growth in the number of complex documents
and texts that require a deeper understanding of machine learning methods to be able to …

Cited by 1756 · Related articles · All 12 versions

[HTML] nih.gov

Machine learning techniques for biomedical image segmentation: an overview of technical aspects and introduction to state‐of‐art applications

H Seo, M Badiei Khuzani, V Vasudevan… - Medical …, 2020 - Wiley Online Library

In recent years, significant progress has been made in developing more accurate and
efficient machine learning algorithms for segmentation of medical and natural images. In this …

Cited by 245 · Related articles · All 8 versions

[PDF] cambridge.org

Deep learning: a statistical viewpoint

PL Bartlett, A Montanari, A Rakhlin - Acta Numerica, 2021 - cambridge.org

The remarkable practical success of deep learning has revealed some major surprises from
a theoretical perspective. In particular, simple gradient methods easily find near-optimal …

Cited by 309 · Related articles · All 12 versions

[PDF] arxiv.org

ROCKET: exceptionally fast and accurate time series classification using random convolutional kernels

A Dempster, F Petitjean, GI Webb - Data Mining and Knowledge Discovery, 2020 - Springer

Most methods for time series classification that attain state-of-the-art accuracy have high
computational complexity, requiring significant training time even for smaller datasets, and …

Cited by 781 · Related articles · All 15 versions

[PDF] mlr.press

Do ImageNet classifiers generalize to ImageNet?

B Recht, R Roelofs, L Schmidt… - … conference on machine …, 2019 - proceedings.mlr.press

We build new test sets for the CIFAR-10 and ImageNet datasets. Both benchmarks have
been the focus of intense research for almost a decade, raising the danger of overfitting to …

Cited by 1670 · Related articles · All 6 versions

[PDF] neurips.cc

On lazy training in differentiable programming

L Chizat, E Oyallon, F Bach - Advances in neural …, 2019 - proceedings.neurips.cc

In a series of recent theoretical works, it was shown that strongly over-parameterized neural
networks trained with gradient-based methods could converge exponentially fast to zero …

Cited by 910 · Related articles · All 14 versions

[PDF] neurips.cc

What can a single attention layer learn? A study through the random features lens

H Fu, T Guo, Y Bai, S Mei - Advances in Neural Information …, 2024 - proceedings.neurips.cc

Attention layers, which map a sequence of inputs to a sequence of outputs, are core
building blocks of the Transformer architecture which has achieved significant …

Cited by 18 · Related articles · All 6 versions

[PDF] mlr.press

A theoretical analysis of deep Q-learning

J Fan, Z Wang, Y Xie, Z Yang - Learning for dynamics and …, 2020 - proceedings.mlr.press

Despite the great empirical success of deep reinforcement learning, its theoretical
foundation is less well understood. In this work, we make the first attempt to theoretically …

Cited by 737 · Related articles · All 9 versions

[PDF] arxiv.org

Randomized numerical linear algebra: Foundations and algorithms

PG Martinsson, JA Tropp - Acta Numerica, 2020 - cambridge.org

This survey describes probabilistic algorithms for linear algebraic computations, such as
factorizing matrices and solving linear systems. It focuses on techniques that have a proven …

Cited by 346 · Related articles · All 16 versions

[PDF] pnas.org

Explaining neural scaling laws

Y Bahri, E Dyer, J Kaplan, J Lee, U Sharma - Proceedings of the National …, 2024 - pnas.org

The population loss of trained deep neural networks often follows precise power-law scaling
relations with either the size of the training dataset or the number of parameters in the …

Cited by 166 · Related articles · All 5 versions
