Dataset distillation using neural feature regression

Y Zhou, E Nezhadarya, J Ba - Advances in Neural …, 2022 - proceedings.neurips.cc
Dataset distillation aims to learn a small synthetic dataset that preserves most of the
information from the original dataset. Dataset distillation can be formulated as a bi-level …
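
The truncated sentence refers to the standard bi-level view of dataset distillation, which can be stated compactly as follows (notation assumed here, not taken from the paper):

```latex
% Bi-level formulation of dataset distillation (generic statement;
% S = small synthetic set, T = original dataset, L = training loss).
\min_{\mathcal{S}} \; \mathcal{L}\big(\theta^{*}(\mathcal{S});\, \mathcal{T}\big)
\quad \text{s.t.} \quad
\theta^{*}(\mathcal{S}) \in \arg\min_{\theta} \; \mathcal{L}(\theta;\, \mathcal{S})
```

The outer problem scores a model trained only on the synthetic set against the original data; the inner problem is that training run itself, which is what makes the outer gradient expensive to compute.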

Theseus: A library for differentiable nonlinear optimization

L Pineda, T Fan, M Monge… - Advances in …, 2022 - proceedings.neurips.cc
We present Theseus, an efficient application-agnostic open source library for differentiable
nonlinear least squares (DNLS) optimization built on PyTorch, providing a common …
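
As a concrete illustration of differentiable nonlinear least squares, the sketch below implements one Gauss-Newton solve in plain PyTorch rather than through Theseus (the toy model and all names are assumptions for illustration). Keeping `create_graph=True` in the inner solve lets gradients flow from the fitted parameter back to an outer quantity, here `scale`:

```python
# Illustrative DNLS sketch (not the Theseus API): a differentiable
# Gauss-Newton solve for a toy 1-parameter curve fit.
import torch

def residuals(theta, x, y, scale):
    # Toy model: y ≈ scale * exp(theta * x).
    return y - scale * torch.exp(theta * x)

def gauss_newton_step(theta, x, y, scale):
    r = residuals(theta, x, y, scale)
    # Jacobian of the residual vector w.r.t. theta, kept in the graph.
    J = torch.autograd.functional.jacobian(
        lambda t: residuals(t, x, y, scale), theta, create_graph=True)
    J = J.reshape(-1, 1)
    # Gauss-Newton update: theta <- theta - (J^T J)^{-1} J^T r.
    delta = torch.linalg.solve(J.T @ J, J.T @ r.reshape(-1, 1))
    return theta - delta.reshape(theta.shape)

x = torch.linspace(0.0, 1.0, 20)
scale = torch.tensor(2.0, requires_grad=True)  # outer parameter
y = 2.0 * torch.exp(0.5 * x)                   # synthetic observations
theta = torch.zeros(1)
for _ in range(10):
    theta = gauss_newton_step(theta, x, y, scale)
theta.sum().backward()
print(theta.detach(), scale.grad)  # theta ≈ 0.5, plus d(theta)/d(scale)
```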

Data distillation: A survey

N Sachdeva, J McAuley - arXiv preprint arXiv:2301.04272, 2023 - arxiv.org
The popularity of deep learning has led to the curation of a vast number of massive and
multifarious datasets. Despite having close-to-human performance on individual tasks …

Dataset distillation with convexified implicit gradients

N Loo, R Hasani, M Lechner… - … Conference on Machine …, 2023 - proceedings.mlr.press
We propose a new dataset distillation algorithm using reparameterization and
convexification of implicit gradients (RCIG) that substantially improves the state-of-the-art …
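
For context, implicit-gradient methods of this kind rest on the implicit function theorem; a generic statement of the resulting hypergradient (standard form, not the paper's exact derivation) is:

```latex
% With inner optimum \theta^{*}(S) = \arg\min_{\theta} L_{\text{in}}(\theta, S),
% the outer gradient avoids unrolling the inner training loop:
\frac{d\, L_{\text{out}}\big(\theta^{*}(S)\big)}{d S}
  = -\,\frac{\partial L_{\text{out}}}{\partial \theta}^{\top}
    \left(\frac{\partial^{2} L_{\text{in}}}{\partial \theta\, \partial \theta^{\top}}\right)^{-1}
    \frac{\partial^{2} L_{\text{in}}}{\partial \theta\, \partial S}
```

A convex inner problem keeps the Hessian positive definite, so the inverse in this identity is well behaved; that is, roughly, the lever the convexification in the title targets.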

The elements of differentiable programming

M Blondel, V Roulet - arXiv preprint arXiv:2403.14606, 2024 - arxiv.org
Artificial intelligence has recently experienced remarkable advances, fueled by large
models, vast datasets, accelerated hardware, and, last but not least, the transformative …

VeLO: Training versatile learned optimizers by scaling up

L Metz, J Harrison, CD Freeman, A Merchant… - arXiv preprint arXiv …, 2022 - arxiv.org
While deep learning models have replaced hand-designed features across many domains,
these models are still trained with hand-designed optimizers. In this work, we leverage the …
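
The core idea of a learned optimizer can be shown in a few lines: a small network maps per-parameter features to updates, replacing a hand-designed rule such as SGD. The sketch below is a deliberately minimal stand-in, not VeLO itself (all names and feature choices are assumptions):

```python
# Minimal learned-optimizer sketch (illustrative, far simpler than VeLO):
# an MLP maps per-parameter (gradient, momentum) features to updates.
import torch
import torch.nn as nn

class LearnedOptimizer(nn.Module):
    def __init__(self, hidden=16):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(2, hidden), nn.ReLU(), nn.Linear(hidden, 1))

    def forward(self, grad, momentum):
        feats = torch.stack([grad, momentum], dim=-1)
        return 1e-2 * self.net(feats).squeeze(-1)  # per-parameter update

opt_net = LearnedOptimizer()
theta = torch.randn(5, requires_grad=True)  # inner-task parameters
momentum = torch.zeros(5)
for _ in range(3):
    loss = (theta ** 2).sum()               # toy inner objective
    grad, = torch.autograd.grad(loss, theta)
    momentum = 0.9 * momentum + grad
    with torch.no_grad():
        theta -= opt_net(grad, momentum)    # network replaces the update rule
```

In meta-training one would differentiate through (or estimate gradients of) many such unrolled inner loops with respect to `opt_net`'s weights; the paper's contribution is doing this at very large scale.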

evosax: JAX-based evolution strategies

RT Lange - Proceedings of the Companion Conference on Genetic …, 2023 - dl.acm.org
The deep learning revolution has been greatly accelerated by the 'hardware lottery': recent
advances in modern hardware accelerators, compilers and the availability of open-source …
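
Libraries in this space typically expose an ask/tell interface: the strategy proposes a population, the user evaluates it, and the strategy updates its search distribution. The loop below shows that pattern in plain NumPy as an illustration; it is not evosax's actual API:

```python
# Generic ask/tell evolution-strategy loop (illustrative, not evosax's API).
import numpy as np

rng = np.random.default_rng(0)
mean, sigma, popsize = np.zeros(2), 0.5, 16

def ask():
    # Sample a population of candidates around the current search mean.
    return mean + sigma * rng.standard_normal((popsize, mean.size))

def tell(candidates, fitness, elite_frac=0.25):
    # Move the mean toward the best-scoring (lowest-fitness) candidates.
    elite = candidates[np.argsort(fitness)[: int(popsize * elite_frac)]]
    return elite.mean(axis=0)

for gen in range(50):
    pop = ask()
    fit = ((pop - 1.0) ** 2).sum(axis=1)  # toy objective, minimum at (1, 1)
    mean = tell(pop, fit)
print(mean)  # ≈ [1, 1]
```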

Re-parameterizing your optimizers rather than architectures

X Ding, H Chen, X Zhang, K Huang, J Han… - arXiv preprint arXiv …, 2022 - arxiv.org
The well-designed structures in neural networks reflect the prior knowledge incorporated
into the models. However, although different models carry different priors, we are used to …

Tutorial on amortized optimization

B Amos - Foundations and Trends® in Machine Learning, 2023 - nowpublishers.com
Optimization is a ubiquitous modeling tool and is often deployed in settings which
repeatedly solve similar instances of the same problem. Amortized optimization methods …
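
The amortization idea is simple to state in code: rather than running an optimizer for every new problem instance, train a network that maps the instance's parameters directly to an approximate solution. A minimal objective-based sketch, with an assumed toy problem family:

```python
# Objective-based amortized optimization sketch: learn x_hat = net(c) by
# minimizing f(net(c), c) over sampled contexts c (toy problem, assumed names).
import torch
import torch.nn as nn

def f(x, c):
    # Family of problems indexed by c; the true minimizer is x* = sin(c).
    return ((x - torch.sin(c)) ** 2).mean()

net = nn.Sequential(nn.Linear(1, 32), nn.Tanh(), nn.Linear(32, 1))
opt = torch.optim.Adam(net.parameters(), lr=1e-2)
for step in range(2000):
    c = 4.0 * torch.rand(128, 1) - 2.0  # sample problem instances
    loss = f(net(c), c)
    opt.zero_grad(); loss.backward(); opt.step()
# One forward pass now amortizes the solve for unseen instances:
print(net(torch.tensor([[1.0]])).item(), torch.sin(torch.tensor(1.0)).item())
```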

Discovering evolution strategies via meta-black-box optimization

R Lange, T Schaul, Y Chen, T Zahavy… - Proceedings of the …, 2023 - dl.acm.org
Optimizing functions without access to gradients is the remit of black-box methods such as
evolution strategies. While highly general, their learning dynamics are often heuristic …
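
The baseline such work builds on is the search-gradient family of evolution strategies: a descent direction is estimated purely from function evaluations, and the update rule's ingredients (recombination weights, step sizes) are exactly what meta-black-box optimization can learn rather than hand-design. A minimal fixed-hyperparameter version, with an assumed toy objective:

```python
# Search-gradient ES sketch (NES-style, fixed hyperparameters): estimate a
# descent direction from black-box evaluations only.
import numpy as np

rng = np.random.default_rng(1)
mean, sigma, popsize = np.zeros(3), 0.3, 32

def objective(x):                          # black box: evaluations only
    return ((x - 0.5) ** 2).sum(axis=-1)

for gen in range(200):
    eps = rng.standard_normal((popsize, mean.size))
    fit = objective(mean + sigma * eps)
    # Score-function estimator of the search gradient, baseline-subtracted
    # to reduce variance.
    grad_est = ((fit - fit.mean())[:, None] * eps).mean(axis=0) / sigma
    mean -= 0.1 * grad_est                 # gradient-free descent step
print(mean)  # ≈ [0.5, 0.5, 0.5]
```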