Photon: A fast query engine for lakehouse systems

A Behm, S Palkar, U Agarwal, T Armstrong… - Proceedings of the …, 2022 - dl.acm.org
Many organizations are shifting to a data management paradigm called the "Lakehouse,"
which implements the functionality of structured data warehouses on top of unstructured …

Graph IRs for impure higher-order languages: making aggressive optimizations affordable with precise effect dependencies

O Bračevac, G Wei, S Jia, S Abeysinghe… - Proceedings of the …, 2023 - dl.acm.org
Graph-based intermediate representations (IRs) are widely used for powerful compiler
optimizations, either interprocedurally in pure functional languages, or intraprocedurally in …

Quantifying TPC-H choke points and their optimizations

M Dreseler, M Boissier, T Rabl, M Uflacker - Proceedings of the VLDB …, 2020 - dl.acm.org
TPC-H continues to be the most widely used benchmark for relational OLAP systems. It
poses a number of challenges, also known as "choke points", which database systems have …

Babelfish: Efficient execution of polyglot queries

PM Grulich, S Zeuch, V Markl - Proceedings of the VLDB Endowment, 2021 - dl.acm.org
Today's users of data processing systems come from different domains, have different levels
of expertise, and prefer different programming languages. As a result, analytical workload …

Efficient execution of user-defined functions in SQL queries

Y Foufoulas, A Simitsis - Proceedings of the VLDB Endowment, 2023 - dl.acm.org
User-defined functions (UDFs) have been widely used to overcome the expressivity
limitations of SQL and complement its declarative nature with functional capabilities. UDFs …

Grizzly: Efficient stream processing through adaptive query compilation

PM Grulich, S Breß, S Zeuch, J Traub… - Proceedings of the …, 2020 - dl.acm.org
Stream Processing Engines (SPEs) execute long-running queries on unbounded data
streams. They follow an interpretation-based processing model and do not perform runtime …

Flan: an expressive and efficient Datalog compiler for program analysis

S Abeysinghe, A Xhebraj, T Rompf - Proceedings of the ACM on …, 2024 - dl.acm.org
Datalog has gained prominence in program analysis due to its expressiveness and ease of
use. Its generic fixpoint resolution algorithm over relational domains simplifies the …

Tuplex: Data science in Python at native code speed

L Spiegelberg, R Yesantharao, M Schwarzkopf… - Proceedings of the …, 2021 - dl.acm.org
Today's data science pipelines often rely on user-defined functions (UDFs) written in Python.
But interpreted Python code is slow, and Python UDFs cannot be compiled to machine code …

Architecting intermediate layers for efficient composition of data management and machine learning systems

S Abeysinghe, F Wang, G Essertel, T Rompf - arXiv preprint arXiv …, 2023 - arxiv.org
Modern data analytics workloads combine relational data processing with machine learning
(ML). Most DBMS handle these workloads by offloading these ML operations to external …

Backpropagation with callbacks: Foundations for efficient and expressive differentiable programming

F Wang, J Decker, X Wu, G Essertel… - Advances in Neural …, 2018 - proceedings.neurips.cc
Training of deep learning models depends on gradient descent and end-to-end
differentiation. Under the slogan of differentiable programming, there is an increasing …