We study subsampling-based ridge ensembles in the proportional asymptotics regime, where the feature size grows proportionally with the sample size such that their ratio …
We investigate popular resampling methods for estimating the uncertainty of statistical models, such as subsampling, the bootstrap, and the jackknife, and their performance in high …
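As a concrete instance of one of the resampling methods named above, the jackknife estimates the variance of a statistic from its leave-one-out recomputations. A minimal sketch, using the standard formula var_jack = ((n-1)/n) * sum_i (theta_(i) - theta_bar)^2 (the function and data here are illustrative, not from the paper):

```python
import statistics

def jackknife_variance(data, stat=statistics.fmean):
    """Jackknife variance estimate of a statistic of `data`."""
    n = len(data)
    # Leave-one-out recomputations of the statistic.
    loo = [stat(data[:i] + data[i + 1:]) for i in range(n)]
    bar = statistics.fmean(loo)
    # Standard jackknife variance formula.
    return (n - 1) / n * sum((t - bar) ** 2 for t in loo)

# For the sample mean, this recovers the usual s^2 / n.
print(jackknife_variance([2.0, 4.0, 6.0, 8.0]))
```

For the mean, the jackknife variance coincides exactly with the classical estimate s²/n, which makes it a convenient sanity check.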
Generalized cross-validation (GCV) is a widely used method for estimating the squared out-of-sample prediction risk that employs a scalar degrees-of-freedom adjustment (in a …
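The scalar degrees-of-freedom adjustment can be made concrete for ridge regression, where GCV(λ) = (1/n)‖y − ŷ‖² / (1 − df/n)² with df = tr(S_λ) the trace of the ridge smoother matrix. A minimal sketch (the function name and toy data are ours, not the paper's):

```python
import numpy as np

def gcv_ridge(X, y, lam):
    """GCV estimate of prediction risk for ridge with penalty lam."""
    n, p = X.shape
    # Ridge smoother matrix S = X (X'X + n*lam*I)^{-1} X'.
    S = X @ np.linalg.solve(X.T @ X + n * lam * np.eye(p), X.T)
    y_hat = S @ y
    df = np.trace(S)  # scalar degrees-of-freedom adjustment
    return np.mean((y - y_hat) ** 2) / (1.0 - df / n) ** 2

rng = np.random.default_rng(0)
X = rng.standard_normal((50, 5))
y = X @ np.ones(5) + rng.standard_normal(50)
print(gcv_ridge(X, y, lam=0.1))
```

Note that GCV needs only the training data and the trace of the smoother, which is what makes it attractive relative to held-out validation.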
Bagging is an important technique for stabilizing machine learning models. In this paper, we derive a finite-sample guarantee on the stability of bagging for any model. Our result places …
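Bagging as referenced above can be sketched generically: fit a base procedure on B bootstrap resamples and average the resulting predictors. The `fit` interface and toy base learner below are illustrative assumptions, not the paper's setup:

```python
import random
import statistics

def bag(fit, data, B=25, seed=0):
    """Bagging: average B predictors fit on bootstrap resamples of `data`."""
    rng = random.Random(seed)
    models = []
    for _ in range(B):
        # Resample n points with replacement (bootstrap sample).
        resample = rng.choices(data, k=len(data))
        models.append(fit(resample))
    # Aggregate by averaging the predictors' outputs.
    return lambda x: statistics.fmean(m(x) for m in models)

# Toy base learner: a constant predictor equal to the sample mean.
fit_mean = lambda sample: (lambda x: statistics.fmean(sample))
predictor = bag(fit_mean, [1.0, 2.0, 3.0, 4.0])
print(predictor(0.0))
```

Averaging over resamples damps the base learner's sensitivity to any single observation, which is the stabilizing effect the paper quantifies.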
P Patil, D LeJeune - arXiv preprint arXiv:2310.04357, 2023 - arxiv.org
We employ random matrix theory to establish consistency of generalized cross validation (GCV) for estimating prediction risks of sketched ridge regression ensembles, enabling …
We characterize the squared prediction risk of ensemble estimators obtained through subagging (subsample bootstrap aggregating) regularized M-estimators and construct a …
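Subagging differs from bagging in that each base estimator is fit on a subsample of size k drawn without replacement. A hedged sketch for the ridge case, one instance of the regularized M-estimators mentioned above (the function name, subsample size, and penalty are illustrative choices):

```python
import numpy as np

def subag_ridge(X, y, lam, k, B=20, seed=0):
    """Subagging: average B ridge fits, each on a size-k subsample."""
    n, p = X.shape
    rng = np.random.default_rng(seed)
    coefs = np.zeros(p)
    for _ in range(B):
        # Subsample k rows without replacement (subsample bootstrap).
        idx = rng.choice(n, size=k, replace=False)
        Xs, ys = X[idx], y[idx]
        coefs += np.linalg.solve(Xs.T @ Xs + lam * np.eye(p), Xs.T @ ys)
    return coefs / B  # ensemble average of the subsampled ridge fits

rng = np.random.default_rng(1)
X = rng.standard_normal((60, 4))
y = X @ np.ones(4) + rng.standard_normal(60)
print(subag_ridge(X, y, lam=0.5, k=30))
```

The subsample size k acts as an additional tuning knob alongside λ, which is what makes the risk characterization of such ensembles nontrivial.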
We study the behavior of optimal ridge regularization and optimal ridge risk for out-of-distribution prediction, where the test distribution deviates arbitrarily from the train …
Ensemble methods such as bagging and random forests are ubiquitous in various fields, from finance to genomics. Despite their prevalence, the question of the efficient tuning of …
JH Du, P Patil - arXiv preprint arXiv:2408.15784, 2024 - arxiv.org
We study the implicit regularization effects induced by (observation) weighting of pretrained features. For weight and feature matrices of bounded operator norms that are infinitesimally …