Concentration inequalities for statistical inference

H Zhang, SX Chen - arXiv preprint arXiv:2011.02258, 2020 - arxiv.org
This paper gives a review of concentration inequalities which are widely employed in
non-asymptotical analyses of mathematical statistics in a wide range of settings, from distribution …

Sharper sub-weibull concentrations

H Zhang, H Wei - Mathematics, 2022 - mdpi.com
Constant-specified and exponential concentration inequalities play an essential role in the
finite-sample theory of machine learning and high-dimensional statistics area. We obtain …

Non-asymptotic guarantees for robust statistical learning under infinite variance assumption

L Xu, F Yao, Q Yao, H Zhang - Journal of Machine Learning Research, 2023 - jmlr.org
There has been a surge of interest in developing robust estimators for models with
heavy-tailed and bounded variance data in statistics and machine learning, while few works …

Optimal decorrelated score subsampling for generalized linear models with massive data

J Gao, L Wang, H Lian - Science China Mathematics, 2024 - Springer
In this paper, we consider the unified optimal subsampling estimation and inference on the
low-dimensional parameter of main interest in the presence of the nuisance parameter for …

Optimal subsampling algorithms for big data generalized linear models

M Ai, J Yu, H Zhang, H Wang - arXiv preprint arXiv:1806.06761, 2018 - researchgate.net
To fast approximate the maximum likelihood estimator with massive data, Wang et al. (JASA,
2017) proposed an optimal subsampling method under the A-optimality criterion (OSMAC) …

Weighted Lasso estimates for sparse logistic regression: Non-asymptotic properties with measurement errors

H Huang, Y Gao, H Zhang, B Li - Acta Mathematica Scientia, 2021 - Springer
For high-dimensional models with a focus on classification performance, the ℓ1-penalized
logistic regression is becoming important and popular. However, the Lasso estimates could …

COM-negative binomial distribution: modeling overdispersion and ultrahigh zero-inflated count data

H Zhang, K Tan, B Li - Frontiers of Mathematics in China, 2018 - Springer
We focus on the COM-type negative binomial distribution with three parameters, which
belongs to COM-type (a, b, 0) class distributions and family of equilibrium distributions of …

Matrix regression heterogeneity analysis

F Zhang, S Zhang, SM Li, M Ren - Statistics and Computing, 2024 - Springer
The development of modern science and technology has facilitated the collection of a large
amount of matrix data in fields such as biomedicine. Matrix data modeling has been …

Asymptotics of Subsampling for Generalized Linear Regression Models under Unbounded Design

G Teng, B Tian, Y Zhang, S Fu - Entropy, 2022 - mdpi.com
The optimal subsampling is a statistical methodology for generalized linear models (GLMs)
to make inference quickly about parameter estimation in massive data regression. Existing …

High-dimensional prediction for count response via sparse exponential weights

TT Mai - arXiv preprint arXiv:2410.15381, 2024 - arxiv.org
Count data is prevalent in various fields like ecology, medical research, and genomics. In
high-dimensional settings, where the number of features exceeds the sample size, feature …