Classic query optimization techniques, including predicate pushdown, are of limited use for machine learning inference queries, because the user-defined functions (UDFs) which …
B Walenz, S Sintos, S Roy, J Yang - arXiv preprint arXiv:1906.09335, 2019 - arxiv.org
We study the problem of efficiently estimating counts for queries involving complex filters, such as user-defined functions, or predicates involving self-joins and correlated subqueries …
Predicate-centric rules for rewriting queries are a key technique in optimizing queries. These include pushing down the predicate below the join and aggregation operators, or optimizing …
Dataframes are a popular abstraction to represent, prepare, and analyze data. Despite the remarkable success of dataframe libraries in R and Python, dataframes face performance …
A widely used approach to characterize input data in both databases and ML is computing the correlation between attributes. The operation is supported by all major database engines …
B Walenz, J Yang - Proceedings of the VLDB Endowment, 2016 - dl.acm.org
We present a system, Perada, for parallel perturbation analysis of database queries. Perturbation analysis considers the results of a query evaluated with (a typically large …
The distributed nature of the cloud, regarding resource placement and application execution, incurs large data transfers. As data movement is unavoidable, it lies within the …
We will demonstrate a prototype query processing engine that uses probabilistic predicates (PPs) to speed up machine learning inference jobs. In current analytic engines, machine …
In this thesis, we propose EnrichDB, a new DBMS technology designed for emerging domains (e.g., social media analytics and sensor-driven smart spaces) that require incoming …