Locally sparse neural networks for tabular biomedical data

J Yang, O Lindenbaum… - … Conference on Machine …, 2022 - proceedings.mlr.press
Tabular datasets with low-sample-size or many variables are prevalent in biomedicine.
Practitioners in this domain prefer linear or tree-based models over neural networks since …

Resolution of the curse of dimensionality in single-cell RNA sequencing data analysis

Y Imoto, T Nakamura, EG Escolar… - Life Science …, 2022 - life-science-alliance.org
Single-cell RNA sequencing (scRNA-seq) can determine gene expression in numerous
individual cells simultaneously, promoting progress in the biomedical sciences. However …

Clustering by principal component analysis with Gaussian kernel in high-dimension, low-sample-size settings

Y Nakayama, K Yata, M Aoshima - Journal of Multivariate Analysis, 2021 - Elsevier
In this paper, we consider clustering based on the kernel principal component analysis
(KPCA) for high-dimension, low-sample-size (HDLSS) data. We give theoretical reasons …

Distance-based and RKHS-based dependence metrics in high dimension

C Zhu, X Zhang, S Yao, X Shao - The Annals of Statistics, 2020 - JSTOR
In this paper, we study distance covariance, Hilbert–Schmidt covariance (aka
Hilbert–Schmidt independence criterion [In Advances in Neural Information Processing Systems …

[Book] Object oriented data analysis

JS Marron, IL Dryden - 2021 - taylorfrancis.com
Object Oriented Data Analysis is a framework that facilitates inter-disciplinary research
through new terminology for discussing the often many possible approaches to the analysis …

Interpoint distance based two sample tests in high dimension

C Zhu, X Shao - 2021 - projecteuclid.org
Bernoulli 27(2), 2021, 1189–1211. https://doi.org/10.3150/20-BEJ1270

Perturbation theory for cross data matrix-based PCA

SH Wang, SY Huang - Journal of Multivariate Analysis, 2022 - Elsevier
Principal component analysis (PCA) has long been a useful and important tool for
dimension reduction. However, this method must be used with care under certain …

Testing homogeneity: the trouble with sparse functional data

C Zhu, JL Wang - Journal of the Royal Statistical Society Series …, 2023 - academic.oup.com
Testing the homogeneity between two samples of functional data is an important task. While
this is feasible for intensely measured functional data, we explain why it is challenging for …

The dispersion bias

LR Goldberg, A Papanicolaou, A Shkolnik - SIAM Journal on Financial …, 2022 - SIAM
We identify and correct excess dispersion in the leading eigenvector of a sample covariance
matrix when the number of variables vastly exceeds the number of observations. Our …

Geometric consistency of principal component scores for high‐dimensional mixture models and its application

K Yata, M Aoshima - Scandinavian Journal of Statistics, 2020 - Wiley Online Library
In this article, we consider clustering based on principal component analysis (PCA) for
high‐dimensional mixture models. We present theoretical reasons why PCA is effective for …