Priors in Bayesian deep learning: A review

V Fortuin - International Statistical Review, 2022 - Wiley Online Library
While the choice of prior is one of the most critical parts of the Bayesian inference workflow,
recent Bayesian deep learning models have often fallen back on vague priors, such as …
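A minimal sketch of what such a vague weight prior typically looks like in practice, assuming (as is common, though the truncated snippet does not say which prior) an isotropic Gaussian over all network parameters; the function name and PyTorch usage are illustrative only, not the paper's code:

    import math
    import torch

    def log_isotropic_gaussian_prior(params, sigma=1.0):
        # Log density of an independent N(0, sigma^2) prior placed on every
        # parameter tensor in `params` (e.g. model.parameters()).
        logp = torch.tensor(0.0)
        for p in params:
            logp = logp - 0.5 * (p / sigma).pow(2).sum()
            logp = logp - p.numel() * math.log(sigma * math.sqrt(2.0 * math.pi))
        return logp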

Optimization for deep learning: An overview

RY Sun - Journal of the Operations Research Society of China, 2020 - Springer
Optimization is a critical component in deep learning. We think optimization for neural
networks is an interesting topic for theoretical research for several reasons. First, its …

Deep learning on a data diet: Finding important examples early in training

M Paul, S Ganguli… - Advances in neural …, 2021 - proceedings.neurips.cc
Recent success in deep learning has partially been driven by training increasingly
overparametrized networks on ever larger datasets. It is therefore natural to ask: how much …

Dataset distillation with infinitely wide convolutional networks

T Nguyen, R Novak, L Xiao… - Advances in Neural …, 2021 - proceedings.neurips.cc
The effectiveness of machine learning algorithms arises from being able to extract useful
features from large amounts of data. As model and dataset sizes increase, dataset …

Fit without fear: remarkable mathematical phenomena of deep learning through the prism of interpolation

M Belkin - Acta Numerica, 2021 - cambridge.org
In the past decade the mathematical theory of machine learning has lagged far behind the
triumphs of deep neural networks on practical challenges. However, the gap between theory …

How neural networks extrapolate: From feedforward to graph neural networks

K Xu, M Zhang, J Li, SS Du, K Kawarabayashi… - arXiv preprint arXiv …, 2020 - arxiv.org
We study how neural networks trained by gradient descent extrapolate, i.e., what they learn
outside the support of the training distribution. Previous works report mixed empirical results …

Task arithmetic in the tangent space: Improved editing of pre-trained models

G Ortiz-Jimenez, A Favero… - Advances in Neural …, 2024 - proceedings.neurips.cc
Task arithmetic has recently emerged as a cost-effective and scalable approach to edit
pre-trained models directly in weight space: by adding the fine-tuned weights of different tasks …
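The weight-space editing operation the snippet alludes to can be sketched directly. This is an illustrative reconstruction, not the authors' implementation: models are treated as plain parameter dictionaries (state dicts of tensors), a task vector is the difference between fine-tuned and pre-trained weights, and editing adds a scaled sum of task vectors back onto the pre-trained weights.

    import torch

    def task_vector(pretrained, finetuned):
        # Per-parameter difference between a fine-tuned and the pre-trained model.
        return {name: finetuned[name] - pretrained[name] for name in pretrained}

    def apply_task_vectors(pretrained, vectors, alpha=1.0):
        # Edit the pre-trained model by adding the scaled sum of task vectors.
        edited = {name: w.clone() for name, w in pretrained.items()}
        for vec in vectors:
            for name in edited:
                edited[name] = edited[name] + alpha * vec[name]
        return edited

    # Toy usage: two "models" that each consist of a single weight matrix.
    base = {"w": torch.zeros(2, 2)}
    task_a = {"w": torch.ones(2, 2)}
    edited = apply_task_vectors(base, [task_vector(base, task_a)], alpha=0.5)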

Differentially private learning needs better features (or much more data)

F Tramer, D Boneh - arXiv preprint arXiv:2011.11660, 2020 - arxiv.org
We demonstrate that differentially private machine learning has not yet reached its "AlexNet
moment" on many canonical vision tasks: linear models trained on handcrafted features …

Finite versus infinite neural networks: an empirical study

J Lee, S Schoenholz, J Pennington… - Advances in …, 2020 - proceedings.neurips.cc
We perform a careful, thorough, and large scale empirical study of the correspondence
between wide neural networks and kernel methods. By doing so, we resolve a variety of …
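On the kernel side of that correspondence, predictions from a sufficiently wide network trained by gradient descent reduce to kernel (ridge) regression with a network-induced kernel such as the NTK. The sketch below is a generic illustration under that assumption, with the kernel matrices taken as given rather than computed from any specific architecture:

    import numpy as np

    def kernel_regression_predict(K_train, K_test_train, y_train, ridge=1e-6):
        # Kernel ridge regression: solve (K + ridge*I) alpha = y on the training
        # kernel, then predict with the test/train kernel block.
        n = K_train.shape[0]
        alpha = np.linalg.solve(K_train + ridge * np.eye(n), y_train)
        return K_test_train @ alpha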

A kernel-based view of language model fine-tuning

S Malladi, A Wettig, D Yu, D Chen… - … on Machine Learning, 2023 - proceedings.mlr.press
It has become standard to solve NLP tasks by fine-tuning pre-trained language models
(LMs), especially in low-data settings. There is minimal theoretical understanding of …