Examining the robustness of fully synthetic data techniques for data with binary variables

GJ Matthews, O Harel, RH Aseltine - Journal of Statistical …, 2010 - Taylor & Francis
There is a growing demand for public use data while at the same time there are increasing
concerns about the privacy of personal information. One proposed method for …

Methods for Synthetic Data Generation

J Snoke, SK Kinney - Handbook of Sharing Confidential Data, 2024 - taylorfrancis.com
In order to understand the methods for generating synthetic data, it is important to start with
an understanding of the basis from which these methods have arisen. Viewed purely from …

Evolution on the Generation and Analysis of Single Imputation Synthetic Datasets in Statistical Disclosure Control

R Moura, CA Coelho, B Sinha - Statistical Modeling and Applications …, 2024 - Springer
We present an overview of the evolution of single imputation synthetic datasets in Statistical
Disclosure Control (SDC). Imputation is a widely used technique for generating privacy …

Steve the matchmaker: the marriage of statistics and computer science in the world of data privacy

A Slavkovic - CHANCE, 2013 - Taylor & Francis
At this year's session in honor of Steve Fienberg's 70th birthday at the Joint Statistical
Meetings in Montréal, I had the privilege to speak and to reflect on his contributions to …

How can we analyze differentially-private synthetic datasets?

AS Charest - Journal of Privacy and Confidentiality, 2011 - journalprivacyconfidentiality.org
Synthetic datasets generated within the multiple imputation framework are now commonly
used by statistical agencies to protect the confidentiality of their respondents. More recently …

Using multiple imputation to integrate and disseminate confidential microdata

JP Reiter - International Statistical Review, 2009 - Wiley Online Library
In data integration contexts, two statistical agencies seek to merge their separate databases
into one file. The agencies also may seek to disseminate data to the public based on the …

Releasing multiply-imputed synthetic data generated in two stages to protect confidentiality

JP Reiter, J Drechsler - Statistica Sinica, 2010 - JSTOR
To protect the confidentiality of survey respondents' identities and sensitive attributes,
statistical agencies can release data in which confidential values are replaced with multiple …

Fully synthetic data for complex surveys

S Mathur, Y Si, JP Reiter - Survey methodology, 2024 - pmc.ncbi.nlm.nih.gov
When seeking to release public use files for confidential data, statistical agencies can
generate fully synthetic data. We propose an approach for making fully synthetic data from …

Generating multiply imputed synthetic datasets: theory and implementation

J Drechsler - 2010 - fis.uni-bamberg.de
The book describes different approaches to generating multiply imputed synthetic datasets
to guarantee confidentiality. Each chapter is dedicated to one approach, first describing the …

Distribution-preserving statistical disclosure limitation

SD Woodcock, G Benedetto - Computational Statistics & Data Analysis, 2009 - Elsevier
One approach to limiting disclosure risk in public-use microdata is to release
multiply-imputed, partially synthetic data sets. These are data on actual respondents, but with …