Generalized cross-validation (GCV) is a widely used method for estimating the squared out-of-sample prediction risk that employs a scalar degrees-of-freedom adjustment (in a …
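The GCV criterion mentioned in this snippet has a standard closed form: the training error inflated by a scalar degrees-of-freedom factor, GCV(λ) = (1/n)·‖y − ŷ‖² / (1 − df/n)², where df = tr(S) for the ridge smoother S. A minimal sketch for ridge regression (variable names and the n·λ penalty scaling are illustrative choices, not taken from the paper):

```python
import numpy as np

def gcv_ridge(X, y, lam):
    """GCV estimate of prediction risk for ridge with penalty lam.

    GCV(lam) = (1/n) * ||y - y_hat||^2 / (1 - df/n)^2,
    where df = tr(S) and S = X (X'X + n*lam*I)^{-1} X' is the
    ridge smoother matrix mapping y to fitted values.
    """
    n, p = X.shape
    S = X @ np.linalg.solve(X.T @ X + n * lam * np.eye(p), X.T)
    y_hat = S @ y
    df = np.trace(S)  # the scalar degrees-of-freedom adjustment
    return np.mean((y - y_hat) ** 2) / (1.0 - df / n) ** 2

# Toy usage: compare the GCV estimate across a small penalty grid.
rng = np.random.default_rng(0)
X = rng.standard_normal((50, 5))
y = X @ np.ones(5) + rng.standard_normal(50)
risks = {lam: gcv_ridge(X, y, lam) for lam in (0.01, 0.1, 1.0)}
```

In practice one would minimize `gcv_ridge` over a grid of `lam` values as a proxy for the unobservable out-of-sample risk.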
P Patil, D LeJeune - arXiv preprint arXiv:2310.04357, 2023 - arxiv.org
We employ random matrix theory to establish consistency of generalized cross-validation (GCV) for estimating prediction risks of sketched ridge regression ensembles, enabling …
We characterize the squared prediction risk of ensemble estimators obtained through subagging (subsample bootstrap aggregating) regularized M-estimators and construct a …
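Subagging, as described in this snippet, fits the base estimator on many subsamples drawn without replacement and averages the results. A minimal sketch with a ridge base estimator (the subsample size `k`, ensemble size `M`, and penalty scaling are illustrative assumptions, not the paper's construction):

```python
import numpy as np

def ridge_fit(X, y, lam):
    """Closed-form ridge coefficients with per-sample penalty scaling."""
    n, p = X.shape
    return np.linalg.solve(X.T @ X + n * lam * np.eye(p), X.T @ y)

def subagg_ridge(X, y, lam, k, M, rng):
    """Average ridge fits over M subsamples of size k drawn without
    replacement (subsample bootstrap aggregating)."""
    n = X.shape[0]
    betas = []
    for _ in range(M):
        idx = rng.choice(n, size=k, replace=False)  # one subsample
        betas.append(ridge_fit(X[idx], y[idx], lam))
    return np.mean(betas, axis=0)  # ensemble average of coefficients

# Toy usage on synthetic data.
rng = np.random.default_rng(1)
X = rng.standard_normal((200, 10))
beta_star = np.ones(10)
y = X @ beta_star + rng.standard_normal(200)
beta_hat = subagg_ridge(X, y, lam=0.1, k=100, M=25, rng=rng)
```

Averaging over subsamples reduces the variance of the base estimator; the snippet's analysis concerns the squared prediction risk of exactly this kind of ensemble for regularized M-estimators, of which ridge is the simplest instance.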
We study the behavior of optimal ridge regularization and optimal ridge risk for out-of-distribution prediction, where the test distribution deviates arbitrarily from the train …
Dataset distillation (DD) is an increasingly important technique that focuses on constructing a synthetic dataset capable of capturing the core information in training data to achieve …
JH Du, P Patil - arXiv preprint arXiv:2408.15784, 2024 - arxiv.org
We study the implicit regularization effects induced by (observation) weighting of pretrained features. For weight and feature matrices of bounded operator norms that are infinitesimally …
Common practice in modern machine learning involves fitting a large number of parameters relative to the number of observations. These overparameterized models can exhibit …
This thesis delves into the spectral properties of covariance matrices and investigates the statistical behavior of Ridge estimators in high-dimensional settings. The first part focuses on …
In this dissertation, we present several forays into the complexity that characterizes modern machine learning, with a focus on the interplay between learning processes, incentives, and …