Dscaler: Synthetically scaling a given relational database

JW Zhang, YC Tay - Proceedings of the VLDB Endowment, 2016 - dl.acm.org
The Dataset Scaling Problem (DSP) defined in previous work states: Given an empirical set
of relational tables D and a scale factor s, generate a database state D̃ that is similar to D but …

Enhanced Regular Expression as a DGL for Generation of Synthetic Big Data

K Cheng, K Abe - Journal of Information Processing …, 2023 - search.ebscohost.com
Synthetic data generation is generally used in performance evaluation and function tests in
data-intensive applications, as well as in various areas of data analytics, such as privacy …

Synthetically Scaling an Empirical Dataset

Z Jiangwei - 2018 - search.proquest.com
Large-scale enterprises, like Amazon and Douban, have enormous datasets. For research
and development, it is impractical to run experiments with such a large dataset. It is therefore …

A Regular Expression-based DGL for Meaningful Synthetic Data Generation

K Cheng - 2020 IEEE International Conference on Big Data …, 2020 - ieeexplore.ieee.org
Synthetic datasets are necessary for performance evaluation and function test in most
database applications. In this paper, we propose a regular expression-based data …

A Collaborative Framework for Similarity Enforcement in Synthetic Scaling of Relational Datasets

JW Zhang, YC Tay - 2019 IEEE 35th International Conference …, 2019 - ieeexplore.ieee.org
Researchers and developers use benchmarks to compare their algorithms and products. A
database benchmark must have a dataset. To be application-specific, this dataset should be …

A collaborative framework for tweaking properties in a synthetic dataset

JW Zhang, Y Wang, YC Tay - Proceedings of the VLDB Endowment, 2018 - dl.acm.org
Researchers and developers use benchmarks to compare their algorithms and products. For
database systems, a benchmark must have a dataset D. To be application-specific, this …

Efficient Algorithms to Compute Hierarchical Summaries from Big Data Streams

Z Shah - 2017 - unsworks.unsw.edu.au
Many data stream applications have hierarchical data; containing time, geographic
locations, product information, clickstreams, server logs, IP addresses. A hierarchical …