K Wang, H Wang, S Li - Knowledge-Based Systems, 2022 - Elsevier
Streaming data analysis has drawn much attention, where large amounts of data arrive in streams. Because limited memory can only store a small batch of data, fast analysis without …
Statistical Foundations of Data Science gives a thorough introduction to commonly used statistical models, contemporary statistical machine learning techniques and algorithms …
HY Wang, M Yang, J Stufken - Journal of the American Statistical …, 2019 - Taylor & Francis
Extraordinary amounts of data are being produced in many branches of science. Proven statistical methods are no longer applicable with extraordinarily large datasets due to …
L Zhou, Z Gong, P Xiang - Annual Review of Statistics and Its …, 2023 - annualreviews.org
Data are distributed across different sites due to computing facility limitations or data privacy considerations. Conventional centralized methods—those in which all datasets are stored …
J Fan, D Wang, K Wang, Z Zhu - Annals of Statistics, 2019 - ncbi.nlm.nih.gov
Principal component analysis (PCA) is fundamental to statistical machine learning. It extracts latent principal factors that contribute to the most variation of the data. When data are stored …
K Wang, B Zhang, F Alenezi, S Li - Information Sciences, 2022 - Elsevier
Distributed systems have been widely used to solve massive data analysis tasks. This article targets quantile regression on distributed systems with non-randomly distributed massive …
H Wang, Y Ma - Biometrika, 2021 - academic.oup.com
We investigate optimal subsampling for quantile regression. We derive the asymptotic distribution of a general subsampling estimator and then derive two versions of optimal …
Y Chen, J Fan, C Ma, Y Yan - Proceedings of the National …, 2019 - National Acad Sciences
Noisy matrix completion aims at estimating a low-rank matrix given only partial and corrupted entries. Despite remarkable progress in designing efficient estimation algorithms …