Stochastic Anderson mixing for nonconvex stochastic optimization

F Wei, C Bao, Y Liu - Advances in Neural Information …, 2021 - proceedings.neurips.cc
Anderson mixing (AM) is an acceleration method for fixed-point iterations. Despite its
success and wide usage in scientific computing, the convergence theory of AM remains …
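For readers unfamiliar with the underlying technique, the following is a minimal sketch of classical (deterministic) Anderson mixing AM(m) for a fixed-point problem x = g(x), not the stochastic variant this paper proposes; the memory size m, solver details, and function names are illustrative assumptions.

```python
import numpy as np

def anderson_mixing(g, x0, m=5, iters=50, tol=1e-10):
    """Classical Anderson mixing AM(m) for the fixed-point problem x = g(x).

    Keeps the last m residuals r_k = g(x_k) - x_k and combines the stored
    g-evaluations with least-squares weights that minimize the mixed residual.
    """
    x = np.asarray(x0, dtype=float)
    X_hist, R_hist = [], []              # histories of g(x_k) and residuals
    for _ in range(iters):
        gx = g(x)
        r = gx - x                       # fixed-point residual
        X_hist.append(gx)
        R_hist.append(r)
        if np.linalg.norm(r) < tol:
            break
        X_hist, R_hist = X_hist[-m:], R_hist[-m:]
        R = np.column_stack(R_hist)      # d x k matrix of recent residuals
        k = R.shape[1]
        ones = np.ones(k)
        # Weights alpha minimize ||R @ alpha|| subject to sum(alpha) = 1,
        # solved here through the bordered (KKT) system with lstsq so a
        # rank-deficient residual matrix does not break the iteration.
        A = np.block([[R.T @ R, ones[:, None]],
                      [ones[None, :], np.zeros((1, 1))]])
        b = np.concatenate([np.zeros(k), [1.0]])
        alpha = np.linalg.lstsq(A, b, rcond=None)[0][:k]
        x = np.column_stack(X_hist) @ alpha   # mixed next iterate (beta = 1)
    return x
```

As a usage check, `anderson_mixing(np.cos, np.array([1.0]))` converges to the fixed point of cos (about 0.739), illustrating the accelerated fixed-point iteration the snippet refers to.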

Eigencurve: Optimal learning rate schedule for SGD on quadratic objectives with skewed Hessian spectrums

R Pan, H Ye, T Zhang - arXiv preprint arXiv:2110.14109, 2021 - arxiv.org
Learning rate schedulers have been widely adopted in training deep neural networks.
Despite their practical importance, there is a discrepancy between their practice and their …
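The setting in the snippet can be illustrated with a small experiment: SGD with noisy gradients on a quadratic whose Hessian has a skewed eigenvalue spectrum, compared under two generic hand-designed schedules. This is not the Eigencurve schedule itself; the spectrum, noise level, and schedules below are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Quadratic objective f(x) = 0.5 * x^T H x with a skewed (heavy-tailed)
# Hessian spectrum: a few large eigenvalues, many tiny ones.
d = 100
eigs = 1.0 / np.arange(1, d + 1) ** 2
H = np.diag(eigs)

def sgd_quadratic(schedule, iters=2000, noise=0.01):
    """Run SGD with noisy gradients g = Hx + xi under a given LR schedule."""
    x = np.ones(d)
    for t in range(iters):
        grad = H @ x + noise * rng.standard_normal(d)
        x -= schedule(t) * grad
    return 0.5 * x @ H @ x                # final objective value

# Two common hand-designed schedules for comparison (illustrative only).
constant = lambda t: 0.5
inverse_time = lambda t: 1.0 / (1.0 + 0.01 * t)

print("constant LR :", sgd_quadratic(constant))
print("1/t decay   :", sgd_quadratic(inverse_time))
```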

RePAST: A ReRAM-based PIM Accelerator for Second-order Training of DNN

Y Zhao, L Jiang, M Gao, N Jing, C Gu, Q Tang… - arXiv preprint arXiv …, 2022 - arxiv.org
Second-order training methods can converge much faster than first-order optimizers in
DNN training. This is because second-order training utilizes the inversion of the second …
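To make the reference to inverting the second-order matrix concrete, here is a minimal damped Newton update on a toy quadratic. It sketches the generic second-order step that such accelerators target, not RePAST's ReRAM/PIM design; the damping value and toy problem are assumptions for illustration.

```python
import numpy as np

def newton_step(grad, hess, x, damping=1e-3):
    """One second-order update: x <- x - (H + damping*I)^{-1} g.

    The damping term keeps an ill-conditioned Hessian invertible; in
    practice one solves the linear system rather than forming the inverse.
    """
    d = x.shape[0]
    step = np.linalg.solve(hess + damping * np.eye(d), grad)
    return x - step

# Toy example: minimize 0.5*x^T A x - b^T x, where the damped Newton
# iteration reaches the exact minimizer in a few steps.
A = np.array([[3.0, 1.0], [1.0, 2.0]])
b = np.array([1.0, 1.0])
x = np.zeros(2)
for _ in range(3):
    g = A @ x - b
    x = newton_step(g, A, x)
print(x, np.linalg.solve(A, b))   # the two should nearly coincide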

Minimizing oracle-structured composite functions

X Shen, A Ali, S Boyd - Optimization and Engineering, 2023 - Springer
We consider the problem of minimizing a composite convex function with two different
access methods: an oracle, for which we can evaluate the value and gradient, and a …
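The two access methods in the snippet, a value/gradient oracle for one term and exploitable structure for the other, resemble the classic proximal-gradient pattern. The sketch below uses an L1 term with a closed-form proximal operator as an assumed stand-in for the paper's more general structured component; the step size rule and toy data are illustrative.

```python
import numpy as np

def prox_l1(v, t):
    """Proximal operator of t*||.||_1 (soft thresholding)."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def proximal_gradient(f_grad, prox_g, x0, step=0.1, iters=500):
    """Minimize f(x) + g(x) when f is seen only through a gradient oracle
    and g only through its proximal operator (one form of 'structured' access)."""
    x = np.asarray(x0, dtype=float)
    for _ in range(iters):
        x = prox_g(x - step * f_grad(x), step)
    return x

# Toy instance: f(x) = 0.5*||Ax - b||^2 (oracle access), g(x) = lam*||x||_1.
rng = np.random.default_rng(1)
A = rng.standard_normal((30, 10))
b = rng.standard_normal(30)
lam = 0.5
x_hat = proximal_gradient(
    f_grad=lambda x: A.T @ (A @ x - b),
    prox_g=lambda v, t: prox_l1(v, lam * t),
    x0=np.zeros(10),
    step=1.0 / np.linalg.norm(A, 2) ** 2,   # 1/L for the smooth term
)
print(x_hat)
```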