Optimal Diagonal Preconditioning

A Jambulapati, J Li, C Musco… - Advances in …, 2024 - proceedings.neurips.cc

We develop a general framework for finding approximately-optimal preconditioners for
solving linear systems. Leveraging this framework we obtain improved runtimes for …

Preconditioners for the Stochastic Training of Implicit Neural Representations

SF Chng, H Saratchandran, S Lucey - arXiv preprint arXiv:2402.08784, 2024 - arxiv.org

Implicit neural representations have emerged as a powerful technique for encoding complex
continuous multidimensional signals as neural networks, enabling a wide range of …

Spectrally constrained optimization

C Garner, G Lerman, S Zhang - Journal of Scientific Computing, 2024 - Springer

We investigate how to solve smooth matrix optimization problems with general linear
inequality constraints on the eigenvalues of a symmetric matrix. We present solution …

Weight Conditioning for Smooth Optimization of Neural Networks

H Saratchandran, TX Wang, S Lucey - European Conference on Computer …, 2025 - Springer

In this article, we introduce a novel normalization technique for neural network weight
matrices, which we term weight conditioning. This approach aims to narrow the gap between …

On Sinkhorn's Algorithm and Choice Modeling

Z Qu, A Galichon, J Ugander - arXiv preprint arXiv:2310.00260, 2023 - arxiv.org

For a broad class of choice and ranking models based on Luce's choice axiom, including the
Bradley–Terry–Luce and Plackett–Luce models, we show that the associated maximum …

Convergence analysis of stochastic gradient descent with adaptive preconditioning for non-convex and convex functions

DA Pasechnyuk, A Gasnikov, M Takáč - arXiv preprint arXiv:2308.14192, 2023 - arxiv.org

Preconditioning is a crucial operation in gradient-based numerical optimisation. It helps
decrease the local condition number of a function by appropriately transforming its gradient …

Gradient Methods with Online Scaling

W Gao, YC Chu, Y Ye, M Udell - arXiv preprint arXiv:2411.01803, 2024 - arxiv.org

We introduce a framework to accelerate the convergence of gradient-based methods with
online learning. The framework learns to scale the gradient at each iteration through an …

Scalable Approximate Optimal Diagonal Preconditioning

W Gao, Z Qu, M Udell, Y Ye - arXiv preprint arXiv:2312.15594, 2023 - arxiv.org

We consider the problem of finding the optimal diagonal preconditioner for a positive definite
matrix. Although this problem has been shown to be solvable and various methods have …

Efficient nonlocal linear image denoising: Bilevel optimization with Nonequispaced Fast Fourier Transform and matrix-free preconditioning

A Miniguano-Trujillo, JW Pearson… - arXiv preprint arXiv …, 2024 - arxiv.org

We present a new approach for nonlocal image denoising, based around the application of
an unnormalized extended Gaussian ANOVA kernel within a bilevel optimization algorithm …

AdaGrad under Anisotropic Smoothness

Y Liu, R Pan, T Zhang - arXiv preprint arXiv:2406.15244, 2024 - arxiv.org

Adaptive gradient methods have been widely adopted in training large-scale deep neural
networks, especially large foundation models. Despite the huge success in practice, their …
