Challenges of big data analysis

J Fan, F Han, H Liu - National science review, 2014 - academic.oup.com
Big Data bring new opportunities to modern society and challenges to data scientists. On the
one hand, Big Data hold great promises for discovering subtle population patterns and …

Random matrix theory in statistics: A review

D Paul, A Aue - Journal of Statistical Planning and Inference, 2014 - Elsevier
We give an overview of random matrix theory (RMT) with the objective of highlighting the
results and concepts that have a growing impact in the formulation and inference of …

Persistent-homology-based machine learning: a survey and a comparative study

CS Pun, SX Lee, K Xia - Artificial Intelligence Review, 2022 - Springer
A suitable feature representation that can both preserve the data intrinsic information and
reduce data complexity and dimensionality is key to the performance of machine learning …

[BOOK][B] Statistical foundations of data science

J Fan, R Li, CH Zhang, H Zou - 2020 - taylorfrancis.com
Statistical Foundations of Data Science gives a thorough introduction to commonly used
statistical models, contemporary statistical machine learning techniques and algorithms …

High-dimensional asymptotics of prediction: Ridge regression and classification

E Dobriban, S Wager - The Annals of Statistics, 2018 - JSTOR
We provide a unified analysis of the predictive risk of ridge regression and regularized
discriminant analysis in a dense random effects model. We work in a high-dimensional …

Finite-sample analysis of interpolating linear classifiers in the overparameterized regime

NS Chatterji, PM Long - Journal of Machine Learning Research, 2021 - jmlr.org
We prove bounds on the population risk of the maximum margin algorithm for two-class
linear classification. For linearly separable training data, the maximum margin algorithm has …

Estimating structured high-dimensional covariance and precision matrices: Optimal rates and adaptive estimation

TT Cai, Z Ren, HH Zhou - 2016 - projecteuclid.org
This is an expository paper that reviews recent developments on optimal estimation of
structured high-dimensional covariance and precision matrices. Minimax rates of …

Direct estimation of differential networks

SD Zhao, TT Cai, H Li - Biometrika, 2014 - academic.oup.com
It is often of interest to understand how the structure of a genetic network differs between two
conditions. In this paper, each condition-specific network is modelled using the precision …

Statistical workflow for feature selection in human metabolomics data

J Antonelli, BL Claggett, M Henglin, A Kim, G Ovsak… - Metabolites, 2019 - mdpi.com
High-throughput metabolomics investigations, when conducted in large human cohorts,
represent a potentially powerful tool for elucidating the biochemical diversity underlying …

Cardinality minimization, constraints, and regularization: a survey

AM Tillmann, D Bienstock, A Lodi… - arXiv preprint arXiv …, 2021 - arxiv.org
We survey optimization problems that involve the cardinality of variable vectors in
constraints or the objective function. We provide a unified viewpoint on the general problem …