Neural optimizers with hypergradients for tuning parameter-wise learning rates

J Fu, R Ng, D Chen, I Ilievski, C Pal… - JMLR: workshop and …, 2017 - researchgate.net
Abstract
Recent studies show that LSTM-based neural optimizers are competitive with state-of-the-art hand-designed optimization methods for short horizons. Existing neural optimizers learn how to update the optimizee parameters by directly predicting the product of the learning rates and the gradients, which we suspect makes the training task unnecessarily difficult. Instead, we train a neural optimizer to control only the learning rates of another optimizer, using the gradients of the training loss with respect to those learning rates (hypergradients). Furthermore, under the assumption that learning rates tend to remain unchanged over a certain number of iterations, the neural optimizer is only allowed to propose learning rates every S iterations, with the rates held fixed in between; this enables it to generalize to longer horizons. The optimizee is trained by Adam on MNIST, and our neural optimizer learns to tune Adam's parameter-wise learning rates. After 5 meta-iterations, another optimizee trained by Adam, whose learning rates are tuned by the learned but frozen neural optimizer, outperforms optimizees trained by existing hand-designed and learned neural optimizers in terms of convergence rate and final accuracy for long horizons across several datasets.
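For intuition, the sketch below shows how a parameter-wise hypergradient can be computed and used to adapt learning rates every S iterations on a toy problem. It is only an illustration under simplifying assumptions: a plain gradient update and a hypergradient-descent rule stand in for the paper's Adam optimizee and LSTM controller, and the loss, the meta step size `beta`, and the window length `S` are hypothetical choices, not the authors' setup. For an update theta_t = theta_{t-1} - alpha * g_{t-1}, the elementwise hypergradient dL(theta_t)/dalpha is -g_t * g_{t-1}.

```python
import numpy as np

# Illustrative sketch only: parameter-wise learning rates adapted via
# hypergradients on a toy least-squares loss. All names below (loss, grad,
# alpha, beta, S) are assumptions for this example.

rng = np.random.default_rng(0)
A = rng.standard_normal((10, 5))
b = rng.standard_normal(10)

def loss(theta):
    r = A @ theta - b
    return 0.5 * r @ r

def grad(theta):
    return A.T @ (A @ theta - b)

theta = np.zeros(5)
alpha = np.full(5, 1e-2)      # one learning rate per parameter
beta = 1e-4                   # step size on the learning rates themselves
S = 10                        # rates are only re-proposed every S iterations
prev_g = np.zeros(5)          # gradient used in the previous parameter update
hypergrad = np.zeros(5)

for t in range(1, 501):
    g = grad(theta)
    # dL(theta_t)/dalpha = -g_t * g_{t-1} (elementwise), accumulated
    # over the S-step window during which alpha is held fixed.
    hypergrad += -g * prev_g
    if t % S == 0:
        alpha = np.clip(alpha - beta * hypergrad, 1e-6, 1.0)
        hypergrad[:] = 0.0
    theta = theta - alpha * g
    prev_g = g

print("final loss:", loss(theta))
```

In the paper the quantity accumulated here would instead be fed to a learned (and, at test time, frozen) neural controller that proposes the next learning rates, rather than to a fixed hypergradient-descent step.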