Synthetic data generation for tabular health records: A systematic review

M Hernandez, G Epelde, A Alberdi, R Cilla, D Rankin - Neurocomputing, 2022 - Elsevier
Synthetic data generation (SDG) research has been ongoing for some time with promising
results in different application domains, including healthcare, biometrics and energy …

Method evaluation, parameterization, and result validation in unsupervised data mining: A critical survey

A Zimmermann - Wiley Interdisciplinary Reviews: Data Mining …, 2020 - Wiley Online Library
Machine Learning (ML) and Data Mining (DM) build tools intended to help users
solve data‐related problems that are infeasible for “unaugmented” humans. Tools need …

Synthetic tabular data evaluation in the health domain covering resemblance, utility, and privacy dimensions

M Hernandez, G Epelde, A Alberdi… - … of Information in …, 2023 - thieme-connect.com
Background: Synthetic tabular data generation is a potentially valuable technology with great
promise for data augmentation and privacy preservation. However, prior to adoption, an …

Exploring city digital twins as policy tools: A task-based approach to generating synthetic data on urban mobility

G Papyshev, M Yarime - Data & Policy, 2021 - cambridge.org
This article discusses the technology of city digital twins (CDTs) and its potential applications
in the policymaking context. The article analyzes the history of the development of the …

Standardised metrics and methods for synthetic tabular data evaluation

M Hernandez, G Epelde, A Alberdi, R Cilla… - Authorea …, 2023 - techrxiv.org
Synthetic Tabular Data Generation (STDG) is a potentially valuable technology with great
promise to augment real data and preserve privacy. However, prior to adoption, an empirical …

Gensyn: A multi-stage framework for generating synthetic microdata using macro data sources

A Acharya, S Sikdar, S Das… - 2022 IEEE International …, 2022 - ieeexplore.ieee.org
Individual-level data (microdata) that characterizes a population is essential for studying
many real-world problems. However, acquiring such data is not straightforward due to cost …

Machine learning methods for generating high dimensional discrete datasets

G Manco, E Ritacco, A Rullo, D Saccà… - … Reviews: Data Mining …, 2022 - Wiley Online Library
The development of platforms and techniques for emerging Big Data and Machine Learning
applications requires the availability of real‐life datasets. A possible solution is to synthesize …

Making a few talk for the many–Modeling driver behavior using synthetic populations generated from experimental data

R Schindler, C Flannagan, A Bálint… - Accident Analysis & …, 2021 - Elsevier
Understanding driver behavior is the basis for the development of many advanced driver
assistance systems, and experimental studies are indispensable tools for constructing …

An economic feasibility assessment framework for underutilised crops using Support Vector Machine

MS Oh, ZY Chen, E Jahanshiri, D Isa… - Computers and electronics …, 2020 - Elsevier
As susceptibility of commercial crops to the changing climates and resulting harsher
conditions increases, interest in the potential of resilient underutilised crops grows …

The relaxed maximum entropy distribution and its application to pattern discovery

S Dalleiger, J Vreeken - 2020 IEEE International Conference …, 2020 - ieeexplore.ieee.org
The maximum entropy principle uniquely identifies the distribution that models our
knowledge about the data, but is otherwise maximally unbiased. As soon as we include non …