A review of tabular data synthesis using GANs on an IDS dataset

S Bourou, A El Saer, TH Velivassaki, A Voulkidis… - Information, 2021 - mdpi.com
Recent technological innovations along with the vast amount of available data worldwide
have led to the rise of cyberattacks against network systems. Intrusion Detection Systems …

Challenges and opportunities of generative models on tabular data

AX Wang, SS Chukova, CR Simpson, BP Nguyen - Applied Soft Computing, 2024 - Elsevier
Tabular data, organized like tables with rows and columns, is widely used. Existing models
for tabular data synthesis often face limitations related to data size or complexity. In contrast …

TabDDPM: Modelling tabular data with diffusion models

A Kotelnikov, D Baranchuk… - International …, 2023 - proceedings.mlr.press
Denoising diffusion probabilistic models are becoming the leading generative modeling
paradigm for many important data modalities. Being the most prevalent in the computer …

Deep neural networks and tabular data: A survey

V Borisov, T Leemann, K Seßler, J Haug… - IEEE transactions on …, 2022 - ieeexplore.ieee.org
Heterogeneous tabular data are the most commonly used form of data and are essential for
numerous critical and computationally demanding applications. On homogeneous datasets …

Synthetic Data--what, why and how?

J Jordon, L Szpruch, F Houssiau, M Bottarelli… - arXiv preprint arXiv …, 2022 - arxiv.org
This explainer document aims to provide an overview of the current state of the rapidly
expanding work on synthetic data technologies, with a particular focus on privacy. The …

CoDi: Co-evolving contrastive diffusion models for mixed-type tabular synthesis

C Lee, J Kim, N Park - International Conference on Machine …, 2023 - proceedings.mlr.press
With growing attention to tabular data these days, the attempt to apply a synthetic table to
various tasks has been expanded toward various scenarios. Owing to the recent advances …

A survey on GAN techniques for data augmentation to address the imbalanced data issues in credit card fraud detection

E Strelcenia, S Prakoonwit - Machine Learning and Knowledge Extraction, 2023 - mdpi.com
Data augmentation is an important procedure in deep learning. GAN-based data
augmentation can be utilized in many domains. For instance, in the credit card fraud domain …

CTAB-GAN+: Enhancing tabular data synthesis

Z Zhao, A Kunar, R Birke, H Van der Scheer… - Frontiers in big …, 2024 - frontiersin.org
The usage of synthetic data is gaining momentum in part due to the unavailability of original
data due to privacy and legal considerations and in part due to its utility as an augmentation …

TabMT: Generating tabular data with masked transformers

M Gulati, P Roysdon - Advances in Neural Information …, 2024 - proceedings.neurips.cc
Autoregressive and Masked Transformers are incredibly effective as generative
models and classifiers. While these models are most prevalent in NLP, they also exhibit …

REaLTabFormer: Generating realistic relational and tabular data using transformers

AV Solatorio, O Dupriez - arXiv preprint arXiv:2302.02041, 2023 - arxiv.org
Tabular data is a common form of organizing data. Multiple models are available to generate
synthetic tabular datasets where observations are independent, but few have the ability to …