AlphaLoRA: Assigning LoRA Experts Based on Layer Training Quality

P Qing, C Gao, Y Zhou, X Diao, Y Yang… - arXiv preprint arXiv …, 2024 - arxiv.org
Parameter-efficient fine-tuning methods, such as Low-Rank Adaptation (LoRA), are known
to enhance training efficiency in Large Language Models (LLMs). Due to the limited …
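
As context for the entry above: the generic LoRA update freezes a pretrained weight matrix W and learns a low-rank correction, y = Wx + (alpha/r)·BAx. A minimal PyTorch sketch of that basic update follows; the class name and hyperparameters are illustrative, and this is not the paper's expert-allocation scheme (AlphaLoRA assigns varying numbers of LoRA experts per layer, which the snippet does not detail).

    import torch
    import torch.nn as nn

    class LoRALinear(nn.Module):
        """Frozen linear layer plus a trainable low-rank update B @ A."""
        def __init__(self, base: nn.Linear, r: int = 8, alpha: float = 16.0):
            super().__init__()
            self.base = base
            for p in self.base.parameters():
                p.requires_grad = False  # freeze the pretrained weights
            self.A = nn.Parameter(torch.randn(r, base.in_features) * 0.01)
            self.B = nn.Parameter(torch.zeros(base.out_features, r))  # zero init: no change at step 0
            self.scale = alpha / r

        def forward(self, x):
            return self.base(x) + self.scale * (x @ self.A.T) @ self.B.T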

Using Uncertainty Quantification to Characterize and Improve Out-of-Domain Learning for PDEs

SC Mouli, DC Maddix, S Alizadeh, G Gupta… - arXiv preprint arXiv …, 2024 - arxiv.org
Existing work in scientific machine learning (SciML) has shown that data-driven learning of
solution operators can provide a fast approximate alternative to classical numerical partial …
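
Concretely, "learning a solution operator" here means fitting a parametric map \mathcal{G}_\theta \approx \mathcal{G}^\dagger : a \mapsto u, where u is the PDE solution associated with input a (coefficients, forcing, or initial condition), so new instances are answered by a forward pass rather than a numerical solve. (Standard operator-learning framing, stated as background; the paper's uncertainty-quantification method is not shown in the snippet.)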

Temperature Optimization for Bayesian Deep Learning

K Ng, C van der Heide, L Hodgkinson, S Wei - arXiv preprint arXiv …, 2024 - arxiv.org
The Cold Posterior Effect (CPE) is a phenomenon in Bayesian Deep Learning (BDL), where
tempering the posterior to a cold temperature often improves the predictive performance of …
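
For reference, the tempered posterior behind the CPE is the standard construction: for a temperature T > 0,

    p_T(\theta \mid \mathcal{D}) \propto \big( p(\mathcal{D} \mid \theta)\, p(\theta) \big)^{1/T},

where T < 1 ("cold") sharpens the posterior around its modes; the CPE is the repeated empirical finding that such sharpening improves predictive metrics. (Standard definition, given here as background; the paper's procedure for optimizing T is not shown in the snippet.)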

Crafting Heavy-Tails in Weight Matrix Spectrum without Gradient Noise

V Kothapalli, T Pang, S Deng, Z Liu, Y Yang - arXiv preprint arXiv …, 2024 - arxiv.org
Modern training strategies of deep neural networks (NNs) tend to induce heavy-tailed (HT)
spectra of layer weights. Extensive efforts to study this phenomenon have found that NNs …
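
A common way to inspect such spectra is to compute the empirical spectral density (ESD) of W^T W / n for each layer and estimate the power-law exponent of its tail. A minimal NumPy sketch, using the Hill estimator as a stand-in for the fuller power-law fits used in this literature (the function name and the choice of k are illustrative):

    import numpy as np

    def esd_tail_alpha(W, k=50):
        """Eigenvalues of W^T W / n and a Hill estimate of the tail exponent."""
        n = W.shape[0]
        eigs = np.sort(np.linalg.svd(W, compute_uv=False) ** 2 / n)
        tail = eigs[-(k + 1):]             # k+1 largest eigenvalues; tail[0] is the cutoff
        alpha = 1.0 + k / np.sum(np.log(tail[1:] / tail[0]))
        return eigs, alpha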

A PAC-Bayesian Perspective on the Interpolating Information Criterion

L Hodgkinson, C van der Heide, R Salomone… - arXiv preprint arXiv …, 2023 - arxiv.org
Deep learning is renowned for its theory-practice gap, whereby principled theory typically
fails to provide much beneficial guidance for implementation in practice. This has been …
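
As background for the entry above: the classical McAllester-style PAC-Bayes bound, for a loss in [0, 1], states that with probability at least 1 - \delta over an i.i.d. sample of size n, simultaneously for all posteriors \rho,

    \mathbb{E}_{h \sim \rho}[L(h)] \le \mathbb{E}_{h \sim \rho}[\hat{L}(h)] + \sqrt{ \frac{ \mathrm{KL}(\rho \,\|\, \pi) + \ln(2\sqrt{n}/\delta) }{ 2n } },

where \pi is a prior fixed before seeing the data and \hat{L} is the empirical risk. (This bound is context for the PAC-Bayesian framing, not the paper's Interpolating Information Criterion itself.)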

Gibbs-Based Information Criteria and the Over-Parameterized Regime

H Chen, GW Wornell, Y Bu - International Conference on …, 2024 - proceedings.mlr.press
Double-descent refers to the unexpected drop in test loss of a learning algorithm beyond an
interpolating threshold with over-parameterization, which is not predicted by information …
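
The double-descent curve the snippet refers to is easy to reproduce in a misspecified least-squares model, where the min-norm fit's test risk peaks near the interpolation threshold d = n and falls again beyond it. A small self-contained NumPy sketch (the dimensions and noise level are arbitrary choices, not from the paper):

    import numpy as np
    rng = np.random.default_rng(0)

    def test_risk(d, n=40, D=80, n_test=2000, sigma=0.5, reps=20):
        """Avg. test risk of min-norm least squares using the first d of D true features."""
        risks = []
        for _ in range(reps):
            beta = rng.normal(size=D) / np.sqrt(D)
            X, Xt = rng.normal(size=(n, D)), rng.normal(size=(n_test, D))
            y = X @ beta + sigma * rng.normal(size=n)
            bhat = np.linalg.lstsq(X[:, :d], y, rcond=None)[0]  # min-norm when d > n
            risks.append(np.mean((Xt @ beta - Xt[:, :d] @ bhat) ** 2))
        return float(np.mean(risks))

    for d in (10, 30, 40, 50, 80):   # risk spikes near the threshold d = n = 40
        print(d, round(test_risk(d), 3))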

On Implicit Smoothness Regularization in Deep Learning

M Gamba - 2024 - diva-portal.org
State of the art neural networks provide a rich class of function approximators, fueling the
remarkable success of gradient-based deep learning on complex high-dimensional …

An Asymptotically Optimal Method for Constrained Stochastic Optimization

S Na, Y Gao, MK Ng - senna1128.github.io
We perform statistical inference for the solution of stochastic optimization problems with
equality and box inequality constraints. The considered problems are prevalent in statistics …
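
The problem class described in the snippet has the generic form (the standard template for equality plus box constraints, written here as an assumption about the paper's setting):

    \min_{x \in \mathbb{R}^d} \; \mathbb{E}_{\xi}\big[ f(x; \xi) \big] \quad \text{s.t.} \quad c(x) = 0, \quad \ell \le x \le u,

where \xi is the random data, c encodes the equality constraints, and \ell, u are the box bounds; per the snippet, the paper performs statistical inference for the solution of such problems.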