Design and evaluation of an RDMA-aware data shuffling operator for parallel database systems

F Liu, L Yin, S Blanas - ACM Transactions on Database Systems (TODS), 2019 - dl.acm.org
The commoditization of high-performance networking has sparked research interest in the
RDMA capability of this hardware. One-sided RDMA primitives, in particular, have generated …

A practical approach to groupjoin and nested aggregates

P Fent, T Neumann - Proceedings of the VLDB Endowment, 2021 - dl.acm.org
Groupjoins, the combined execution of a join and a subsequent group by, are common in
analytical queries, and occur in about 1/8 of the queries in TPC-H and TPC-DS. While they …

[BOOK][B] Transaction processing on modern hardware

M Sadoghi, S Blanas - 2019 - books.google.com
The last decade has brought groundbreaking developments in transaction processing. This
resurgence of an otherwise mature research area has spurred from the diminishing cost per …

Beyond MPI: New communication interfaces for database systems and data-intensive applications

F Liu, C Barthels, S Blanas, H Kimura, G Swart - ACM SIGMOD Record, 2021 - dl.acm.org
Networks with Remote Direct Memory Access (RDMA) support are becoming increasingly
common. RDMA, however, offers a limited programming interface to remote memory that …

Distributed numerical and machine learning computations via two-phase execution of aggregated join trees

D Jankov, B Yuan, S Luo, C Jermaine - Proceedings of the VLDB …, 2021 - par.nsf.gov
When numerical and machine learning (ML) computations are expressed relationally,
classical query execution strategies (hash-based joins and aggregations) can do a poor job …

Practical planning and execution of groupjoin and nested aggregates

P Fent, A Birler, T Neumann - The VLDB Journal, 2023 - Springer
Groupjoins combine execution of a join and a subsequent group-by. They are common in
analytical queries and occur in about 1/8 of the queries in TPC-H and TPC-DS. While they were …

Topology-aware parallel data processing: Models, algorithms and systems at scale

S Blanas, P Koutris, A Sidiropoulos - 10th Annual Conference on …, 2020 - par.nsf.gov
The analysis of massive datasets requires a large number of processors. Prior research has
largely assumed that tracking the actual data distribution and the underlying network …

Algorithms for a topology-aware massively parallel computation model

X Hu, P Koutris, S Blanas - Proceedings of the 40th ACM SIGMOD …, 2021 - dl.acm.org
Most of the prior work in massively parallel data processing assumes homogeneity, i.e., every
computing unit has the same computational capability and can communicate with every …

Handling data skew for aggregation in Spark SQL using task stealing

Z He, Q Huang, Z Li, C Weng - International Journal of Parallel …, 2020 - Springer
In distributed in-memory computing systems, data distribution has a large impact on
performance. Designing a good partition algorithm is difficult and requires users to have …

Low Latency Query Planning and Processing in Database Systems

P Fent - 2024 - mediatum.ub.tum.de
Efficient data processing is one of the core techniques that enables modern data driven
computer systems. Database systems are uniquely positioned to use increasing hardware …