Objectives
A major open question affecting policy decisions is the true number of Covid-19 infections. Most infections go undetected because of the large number of asymptomatic cases. We provide an efficient, easy-to-compute and robust lower-bound estimator for the number of undetected cases.
Methods
A modified version of the Chao estimator is proposed, based on the cumulative time-series distributions of cases and deaths. Heterogeneity is addressed by assuming that a geometric distribution underlies the data-generating process. An approximate analytical variance of the estimator is derived to compute reliable 95% confidence intervals.
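To illustrate the idea behind a Chao-type lower bound under a geometric assumption, the following minimal sketch may help. It is not the modified estimator proposed here, whose exact form is not given in this abstract. It assumes two hypothetical inputs: f1, the number of units identified exactly once, and f2, the number identified exactly twice. How such counts would be obtained from the case and death series is not specified here, so they are taken as given. The function names are illustrative only.

```python
def chao_geometric_lower_bound(f1: int, f2: int) -> float:
    """Generic Chao-type lower bound for the number of unobserved units.

    For a geometric count distribution with p_x = p * (1 - p)**x,
    p_1**2 / p_2 = p_0, so f1**2 / f2 estimates the zero-count frequency.
    Under a mixture of geometrics, the Cauchy-Schwarz inequality gives
    p_1**2 / p_2 <= p_0, which is why the result is a lower bound.
    """
    if f1 < 0 or f2 <= 0:
        raise ValueError("f1 must be non-negative and f2 must be positive")
    return f1 ** 2 / f2


def chao_geometric_lower_bound_adjusted(f1: int, f2: int) -> float:
    """Variant that stays finite when f2 == 0.

    Assumption: this mirrors the usual bias adjustment of Chao's
    estimator, carried over to the geometric case for illustration.
    """
    if f1 < 0 or f2 < 0:
        raise ValueError("f1 and f2 must be non-negative")
    return f1 * (f1 - 1) / (f2 + 1)


if __name__ == "__main__":
    # Hypothetical counts, chosen only to show the arithmetic.
    print(chao_geometric_lower_bound(10, 4))           # 25.0
    print(chao_geometric_lower_bound_adjusted(10, 4))  # 18.0
```

With the hypothetical counts f1 = 10 and f2 = 4, the plain bound is 10²/4 = 25 and the adjusted variant is 10·9/5 = 18. Both are to be read as lower bounds on the number of unobserved units, not as point estimates of the true number.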
Results
A motivating application to Austrian data is provided and compared with an independent, representative study of the prevalence of Covid-19 infection. Our estimates agree well with the results of that study, but the capture–recapture estimate carries less uncertainty because it is based on a larger sample. Results for other European countries are mentioned in the discussion. The estimated ratio of total cases to observed cases is around 2.3 for all the countries analyzed.
Conclusions
The proposed method addresses a fundamental open question: “How many undetected cases are going around?” Capture–recapture (CR) methods provide a straightforward way to shed light on undetected cases while accounting for heterogeneity in the probability of being detected.