Scalable approximate query processing with the DBO engine

C Jermaine, S Arumugam, A Pol, A Dobra - ACM Transactions on …, 2008 - dl.acm.org
This article describes query processing in the DBO database system. Like other database
systems designed for ad hoc analytic processing, DBO is able to compute the exact answers …

Sketches for size of join estimation

F Rusu, A Dobra - ACM Transactions on Database Systems (TODS), 2008 - dl.acm.org
Sketching techniques provide approximate answers to aggregate queries both for data-
streaming and distributed computation. Small space summaries that have linearity …

A research agenda for query processing in large-scale peer data management systems

K Hose, A Roth, A Zeitz, KU Sattler, F Naumann - Information Systems, 2008 - Elsevier
Peer Data Management Systems (PDMSs) are a novel, useful, but challenging paradigm for
distributed data management and query processing. Conventional integrated information …

Materialized sample views for database approximation

S Joshi, C Jermaine - IEEE Transactions on Knowledge and …, 2008 - ieeexplore.ieee.org
We consider the problem of creating a sample view of a database table. A sample view is an
indexed materialized view that permits efficient sampling from an arbitrary range query over …

Sampling algorithms for evolving datasets

R Gemulla - 2008 - researchgate.net
Perhaps the most flexible synopsis of a database is a uniform random sample of the data;
such samples are widely used to speed up the processing of analytic queries and data …

Confidence bounds for sampling-based group by estimates

F Xu, C Jermaine, A Dobra - ACM Transactions on Database Systems …, 2008 - dl.acm.org
Sampling is now a very important data management tool, to such an extent that an interface
for database sampling is included in the latest SQL standard. In this article we reconsider in …

The DBO database system

F Rusu, F Xu, LL Perez, M Wu, R Jampani… - Proceedings of the …, 2008 - dl.acm.org
We demonstrate our prototype of the DBO database system. DBO is designed to facilitate
scalable analytic processing over large data archives. DBO's analytic processing …

Maintaining very large random samples using the geometric file

A Pol, C Jermaine, S Arumugam - The VLDB Journal, 2008 - Springer
Random sampling is one of the most fundamental data management tools available.
However, most current research involving sampling considers the problem of how to use a …

Database aggregation query result estimator

S Chaudhuri, VR Narasayya, R Motwani… - US Patent …, 2008 - Google Patents
Aggregation queries are performed by first identifying outlier values, aggregating the outlier
values, and sampling the remaining data after pruning the outlier values. The sampled data …

Network-aware optimization in distributed data stream management systems

RB Kuntschke - 2008 - mediatum.ub.tum.de
Stream-based data management enables the efficient analysis and processing of large
volumes of data in distributed environments. This thesis presents network-aware …