On join sampling and the hardness of combinatorial output-sensitive join algorithms

S Deng, S Lu, Y Tao - Proceedings of the 42nd ACM SIGMOD-SIGACT …, 2023 - dl.acm.org
We present a dynamic index structure for join sampling. Built for an (equi-)join Q (let IN be
the total number of tuples in the input relations of Q), the structure uses Õ(IN) space …

Plexus: Optimizing Join Approximation for Geo-Distributed Data Analytics

J Wolfrath, A Chandra - Proceedings of the 2023 ACM Symposium on …, 2023 - dl.acm.org
Modern applications are increasingly generating and persisting data across geo-distributed
data centers or edge clusters rather than a single cloud. This paradigm introduces …

ShadowAQP: Efficient Approximate Group-by and Join Query via Attribute-Oriented Sample Size Allocation and Data Generation

R Gu, H Li, H Dai, W Huang, J Xue, M Li… - Proceedings of the …, 2023 - dl.acm.org
Approximate query processing (AQP) is one of the key techniques for coping with the big data
querying problem, since it obtains approximate answers efficiently. To address non …

Efficient Complex Aggregate Queries with Accuracy Guarantee Based on Execution Cost Model over Knowledge Graphs

S Ye, X Xu, Y Wang, T Fu - Mathematics, 2023 - mdpi.com
Knowledge graphs (KGs) have gained prominence for representing real-world facts, with
queries of KGs being crucial for their application. Aggregate queries, as one of the most …

A Step Toward Deep Online Aggregation (Extended Version)

N Sheoran, S Chockchowwat, A Chheda… - arXiv preprint arXiv …, 2023 - arxiv.org
For exploratory data analysis, it is often desirable to know what answers you are likely to get
before actually obtaining those answers. This can potentially be achieved by designing …

A step toward deep online aggregation

N Sheoran, S Chockchowwat, A Chheda… - Proceedings of the …, 2023 - dl.acm.org
For exploratory data analysis, it is often desirable to know what answers you are likely to get
before actually obtaining those answers. This can potentially be achieved by designing …

Learning-based Sample Tuning for Approximate Query Processing in Interactive Data Exploration

H Zhang, Y Jing, Z He, K Zhang… - IEEE Transactions on …, 2023 - ieeexplore.ieee.org
For interactive data exploration, approximate query processing (AQP) is a useful approach
that usually uses samples to provide a timely response for queries by trading query …

A hybrid prediction and search approach for flexible and efficient exploration of big data

J Li, Y Sun, Z Lei, S Chen, G Andrienko… - Journal of …, 2023 - Springer
This paper presents a hybrid prediction and search approach (HPS) for building
visualization systems of big data. The basic idea is training a regression model to predict a …

The complexity of aggregates over extractions by regular expressions

J Doleschal, B Kimelfeld… - Logical Methods in …, 2023 - lmcs.episciences.org
Regular expressions with capture variables, also known as “regex-formulas,” extract
relations of spans (intervals identified by their start and end indices) from text. In turn, the …

Random-Order Enumeration for Self-Reducible NP-Problems

P Chen, D Miao, W Tong, Z Guo, J Li, Z Cai - arXiv preprint arXiv …, 2023 - arxiv.org
In plenty of data analysis tasks, a basic and time-consuming process is to produce a large
number of solutions and feed them into downstream processing. Various enumeration …