How I learned to stop worrying and love re-optimization

M Perron, Z Shang, T Kraska… - 2019 IEEE 35th …, 2019 - ieeexplore.ieee.org
Cost-based query optimizers remain one of the most important components of database
management systems for analytic workloads. Though modern optimizers select plans close …

Joins on samples: A theoretical guide for practitioners

D Huang, DY Yoon, S Pettie, B Mozafari - arXiv preprint arXiv:1912.03443, 2019 - arxiv.org
Despite decades of research on approximate query processing (AQP), our understanding of
sample-based joins has remained limited and, to some extent, even superficial. The …

Wander join and XDB: online aggregation via random walks

F Li, B Wu, K Yi, Z Zhao - ACM Transactions on Database Systems …, 2019 - dl.acm.org
Joins are expensive, and online aggregation over joins was proposed to mitigate the cost,
which offers users a nice and flexible tradeoff between query efficiency and accuracy in a …

Learning to sample: Counting with complex queries

B Walenz, S Sintos, S Roy, J Yang - arXiv preprint arXiv:1906.09335, 2019 - arxiv.org
We study the problem of efficiently estimating counts for queries involving complex filters,
such as user-defined functions, or predicates involving self-joins and correlated subqueries …

Experiences with approximating queries in Microsoft's production big-data clusters

S Kandula, K Lee, S Chaudhuri… - Proceedings of the VLDB …, 2019 - dl.acm.org
With the rapidly growing volume of data, it is more attractive than ever to leverage
approximations to answer analytic queries. Sampling is a powerful technique which has …

Learning models over relational data: A brief tutorial

M Schleich, D Olteanu, M Abo-Khamis, HQ Ngo… - … Conference, SUM 2019 …, 2019 - Springer
This tutorial overviews the state of the art in learning models over relational databases and
makes the case for a first-principles approach that exploits recent developments in database …

Selective wander join: Fast progressive visualizations for data joins

M Procopio, C Scheidegger, E Wu, R Chang - Informatics, 2019 - mdpi.com
Progressive visualization offers a great deal of promise for big data visualization; however,
current progressive visualization systems do not allow for continuous interaction. What if …

On the performance and convergence of distributed stream processing via approximate fault tolerance

Z Cheng, Q Huang, PPC Lee - The VLDB Journal, 2019 - Springer
Fault tolerance is critical for distributed stream processing systems, yet achieving error-free
fault tolerance often incurs substantial performance overhead. We present AF-Stream, a …

Real-time interactive analysis on big data

袁喆, 文继荣, 魏哲巍, 刘家俊, 姚斌, 郑凯 - 软件学报, 2019 - jos.org.cn
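Real-time interactive analysis targets multi-objective, multi-perspective analysis tasks. Through multiple rounds of user-database interaction,
it progressively clarifies the analysis tasks and goals, builds a full picture of the relevant domain, and ultimately yields scientific and comprehensive analysis results …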

In Good Company: Efficient Retrieval of the Top-k Most Relevant Event-Partner Pairs

D Wu, Y Zhu, CS Jensen - … , DASFAA 2019, Chiang Mai, Thailand, April 22 …, 2019 - Springer
The proliferation of event-based social networking (EBSN) motivates a range of studies on
topics such as event, venue, and friend recommendation and event creation and …