Spectral methods for data science: A statistical perspective

Y Chen, Y Chi, J Fan, C Ma - Foundations and Trends® in …, 2021 - nowpublishers.com
Spectral methods have emerged as a simple yet surprisingly effective approach for
extracting information from massive, noisy and incomplete data. In a nutshell, spectral …

A unifying tutorial on approximate message passing

OY Feng, R Venkataramanan, C Rush… - … and Trends® in …, 2022 - nowpublishers.com
Over the last decade or so, Approximate Message Passing (AMP) algorithms have become
extremely popular in various structured high-dimensional statistical problems. Although the …

Missing data imputation using optimal transport

B Muzellec, J Josse, C Boyer… - … Conference on Machine …, 2020 - proceedings.mlr.press
Missing data is a crucial issue when applying machine learning algorithms to real-world
datasets. Starting from the simple assumption that two batches extracted randomly from the …

Recent developments in factor models and applications in econometric learning

J Fan, K Li, Y Liao - Annual Review of Financial Economics, 2021 - annualreviews.org
This article provides a selective overview of the recent developments in factor models and
their applications in econometric learning. We focus on the perspective of the low-rank …

Matrix completion, counterfactuals, and factor analysis of missing data

J Bai, S Ng - Journal of the American Statistical Association, 2021 - Taylor & Francis
This article proposes an imputation procedure that uses the factors estimated from a tall
block along with the re-rotated loadings estimated from a wide block to impute missing …

Causal matrix completion

A Agarwal, M Dahleh, D Shah… - The thirty sixth annual …, 2023 - proceedings.mlr.press
Matrix completion is the study of recovering an underlying matrix from a sparse subset of
noisy observations. Traditionally, it is assumed that the entries of the matrix are “missing …

Subspace estimation from unbalanced and incomplete data matrices: ℓ2,∞ statistical guarantees

C Cai, G Li, Y Chi, HV Poor, Y Chen - 2021 - projecteuclid.org
The Annals of Statistics 2021, Vol. 49, No. 2, 944–967. https://doi.org/10.1214/20-AOS1986 …

Ensemble principal component analysis

O Dorabiala, AY Aravkin, JN Kutz - IEEE Access, 2024 - ieeexplore.ieee.org
Efficient representations of data are essential for processing, exploration, and human
understanding, and Principal Component Analysis (PCA) is one of the most common …

Missing not at random in matrix completion: The effectiveness of estimating missingness probabilities under a low nuclear norm assumption

W Ma, GH Chen - Advances in neural information …, 2019 - proceedings.neurips.cc
Matrix completion is often applied to data with entries missing not at random (MNAR). For
example, consider a recommendation system where users tend to only reveal ratings for …

Identification and semiparametric efficiency theory of nonignorable missing data with a shadow variable

W Miao, L Liu, Y Li, EJ Tchetgen Tchetgen… - ACM/JMS Journal of …, 2024 - dl.acm.org
We consider identification and estimation with an outcome missing not at random (MNAR).
We study an identification strategy based on a so-called shadow variable. A shadow …