Authors
Paul Micaelli, Amos J Storkey
Publication date
2021/12/6
Conference
Advances in Neural Information Processing Systems
Volume
34
Description
Gradient-based hyperparameter optimization has earned widespread popularity in the context of few-shot meta-learning, but remains broadly impractical for tasks with long horizons (many gradient steps) due to memory scaling and gradient degradation issues. A common workaround is to learn hyperparameters online, but this introduces greediness which comes with a significant performance drop. We propose forward-mode differentiation with sharing (FDS), a simple and efficient algorithm which tackles memory scaling issues with forward-mode differentiation, and gradient degradation issues by sharing hyperparameters that are contiguous in time. We provide theoretical guarantees about the noise reduction properties of our algorithm, and demonstrate its efficiency empirically by differentiating through gradient steps of unrolled optimization. We consider large hyperparameter search ranges on CIFAR-10, where we significantly outperform greedy gradient-based alternatives while achieving speedups compared to state-of-the-art black-box methods.
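As a rough illustration of the forward-mode idea described above (not the authors' implementation), the sketch below uses JAX to compute the derivative of a validation loss with respect to a single learning rate shared across a contiguous block of unrolled SGD steps. Because the hypergradient is obtained with a Jacobian-vector product (jax.jvp), memory does not grow with the unroll length, unlike backpropagation through the full trajectory. All function names, data, and the choice of learning rate as the hyperparameter are assumptions made for this example.

```python
import jax
import jax.numpy as jnp

def train_loss(w, x, y):
    return jnp.mean((x @ w - y) ** 2)

def val_loss(w, xv, yv):
    return jnp.mean((xv @ w - yv) ** 2)

def unroll(lr, w0, x, y, steps=100):
    # One hyperparameter (lr) shared across all `steps` contiguous updates.
    def step(w, _):
        g = jax.grad(train_loss)(w, x, y)
        return w - lr * g, None
    w, _ = jax.lax.scan(step, w0, None, length=steps)
    return w

def hyper_objective(lr, w0, x, y, xv, yv):
    # Validation loss after unrolled training, as a function of the hyperparameter.
    w = unroll(lr, w0, x, y)
    return val_loss(w, xv, yv)

key = jax.random.PRNGKey(0)
x = jax.random.normal(key, (32, 4));  y = x @ jnp.ones(4)
xv = jax.random.normal(key, (32, 4)); yv = xv @ jnp.ones(4)
w0 = jnp.zeros(4)

# Forward-mode differentiation through the whole unroll: propagate a tangent
# alongside the primal computation instead of storing the trajectory.
obj = lambda lr: hyper_objective(lr, w0, x, y, xv, yv)
value, d_dlr = jax.jvp(obj, (0.1,), (1.0,))
print(value, d_dlr)  # hypergradient of the validation loss w.r.t. the shared lr
```

The hypothetical `d_dlr` above would then drive an outer-loop update of the shared hyperparameter; FDS additionally partitions a long horizon into several such contiguous blocks, each with its own shared value.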
Total citations