Survey on privacy-preserving techniques for microdata publication

T Carvalho, N Moniz, P Faria, L Antunes - ACM Computing Surveys, 2023 - dl.acm.org
The exponential growth of collected, processed, and shared microdata has given rise to
concerns about individuals' privacy. As a result, laws and regulations have emerged to …

A multi-dimensional evaluation of synthetic data generators

FK Dankar, MK Ibrahim, L Ismail - IEEE Access, 2022 - ieeexplore.ieee.org
Synthetic datasets are gradually emerging as solutions for data sharing. Multiple synthetic
data generators have been introduced in the last decade fueled by advancement in machine …

Fake it till you make it: Guidelines for effective synthetic data generation

FK Dankar, M Ibrahim - Applied Sciences, 2021 - mdpi.com
Synthetic data provides a privacy protecting mechanism for the broad usage and sharing of
healthcare data for secondary purposes. It is considered a safe approach for the sharing of …

Synthetic tabular data evaluation in the health domain covering resemblance, utility, and privacy dimensions

M Hernandez, G Epelde, A Alberdi… - … of information in …, 2023 - thieme-connect.com
Background Synthetic tabular data generation is a potentially valuable technology with great
promise for data augmentation and privacy preservation. However, prior to adoption, an …

When AI meets information privacy: The adversarial role of AI in data sharing scenario

A Majeed, SO Hwang - IEEE Access, 2023 - ieeexplore.ieee.org
Artificial intelligence (AI) is a transformative technology with a substantial number of practical
applications in commercial sectors such as healthcare, finance, aviation, and smart cities. AI …

Incorporation of synthetic data generation techniques within a controlled data processing workflow in the health and wellbeing domain

M Hernandez, G Epelde, A Beristain, R Álvarez… - Electronics, 2022 - mdpi.com
To date, the use of synthetic data generation techniques in the health and wellbeing domain
has been mainly limited to research activities. Although several open source and …

Privacy measurements in tabular synthetic data: State of the art and future research directions

ATP Boudewijn, AF Ferraris, D Panfilo… - … 2023 Workshop on …, 2023 - openreview.net
Synthetic data (SD) have garnered attention as a privacy enhancing technology.
Unfortunately, there is no standard for assessing their degree of privacy protection. In this …

Sarve: synthetic data and local differential privacy for private frequency estimation

G Varma, R Chauhan, D Singh - Cybersecurity, 2022 - Springer
The collection of user attributes by service providers is a double-edged sword. They are
instrumental in driving statistical analysis to train more accurate predictive models like …

Holdout-based empirical assessment of mixed-type synthetic data

M Platzer, T Reutterer - Frontiers in Big Data, 2021 - frontiersin.org
AI-based data synthesis has seen rapid progress over the last several years and is
increasingly recognized for its promise to enable privacy-respecting high-fidelity data …

Generating synthetic training data for supervised de-identification of electronic health records

CA Libbi, J Trienes, D Trieschnigg, C Seifert - Future Internet, 2021 - mdpi.com
A major hurdle in the development of natural language processing (NLP) methods for
Electronic Health Records (EHRs) is the lack of large, annotated datasets. Privacy concerns …