Multiple imputation: a review of practical and theoretical findings

JS Murray - 2018 - projecteuclid.org
Multiple imputation is a straightforward method for handling missing data in a principled
fashion. This paper presents an overview of multiple imputation, including important …

Synthetic data

TE Raghunathan - Annual review of statistics and its application, 2021 - annualreviews.org
Demand for access to data, especially data collected using public funds, is ever growing. At
the same time, concerns about the disclosure of the identities of and sensitive information …

Generating multi-label discrete patient records using generative adversarial networks

E Choi, S Biswal, B Malin, J Duke… - Machine learning …, 2017 - proceedings.mlr.press
Access to electronic health record (EHR) data has motivated computational advances in
medical research. However, various concerns, particularly over privacy, can limit access to …

Differential privacy: A primer for a non-technical audience

A Wood, M Altman, A Bembenek, M Bun… - Vand. J. Ent. & Tech …, 2018 - HeinOnline
Alexandra Wood, Micah Altman, Aaron Bembenek, Mark …

Balancing data privacy and usability in the federal statistical system

VJ Hotz, CR Bollinger, T Komarova… - Proceedings of the …, 2022 - National Acad Sciences
The federal statistical system is experiencing competing pressures for change. On the one
hand, for confidentiality reasons, much socially valuable data currently held by federal …

synthpop: Bespoke creation of synthetic data in R

B Nowok, GM Raab, C Dibben - Journal of statistical software, 2016 - jstatsoft.org
In many contexts, confidentiality constraints severely restrict access to unique and valuable
microdata. Synthetic data which mimic the original observed data and preserve the …

[PDF] Multiple imputation for statistical disclosure limitation

TE Raghunathan, JP Reiter, DB Rubin - Journal of official statistics, 2003 - stat.duke.edu
This article evaluates the use of the multiple imputation framework to protect the
confidentiality of respondents' answers in sample surveys. The basic proposal is to simulate …

[PDF] Using CART to generate partially synthetic public use microdata

JP Reiter - Journal of official statistics, 2005 - fcsm.gov
To limit disclosure risks, one approach is to release partially synthetic, public use microdata
sets. These comprise the units originally surveyed, but some collected values, for example …

[HTML] Membership inference attacks against synthetic health data

Z Zhang, C Yan, BA Malin - Journal of biomedical informatics, 2022 - Elsevier
Synthetic data generation has emerged as a promising method to protect patient privacy
while sharing individual-level health data. Intuitively, sharing synthetic data should reduce …

[BOOK] Synthetic datasets for statistical disclosure control: theory and implementation

J Drechsler - 2011 - books.google.com
The aim of this book is to give the reader a detailed introduction to the different approaches
to generating multiply imputed synthetic datasets. It describes all approaches that have been …