Communication-efficient Jaccard similarity for high-performance distributed genome comparisons

M Besta, R Kanakagiri, H Mustafa… - 2020 IEEE …, 2020 - ieeexplore.ieee.org
The Jaccard similarity index is an important measure of the overlap of two sets, widely used
in machine learning, computational genomics, information retrieval, and many other areas …

Cheetah: Accelerating database queries with switch pruning

M Tirmazi, R Ben Basat, J Gao, M Yu - Proceedings of the 2020 ACM …, 2020 - dl.acm.org
Modern database systems are growing increasingly distributed and struggle to reduce query
completion time with a large volume of data. In this paper, we leverage programmable …

Adaptive distributed streaming similarity joins

G Siachamis, K Psarakis, M Fragkoulis… - Proceedings of the 17th …, 2023 - dl.acm.org
How can we perform similarity joins of multi-dimensional streams in a distributed fashion,
achieving low latency? Can we adaptively repartition those streams in order to retain high …

Massively Parallel Join Algorithms

X Hu, K Yi - ACM SIGMOD Record, 2020 - dl.acm.org
Due to the rapid development of massively parallel data processing systems such as
MapReduce and Spark, there have been revived interests in designing algorithms in a …

Parallel communication obliviousness: One round and beyond

Y Tao, R Wang, S Deng - Proceedings of the ACM on Management of …, 2024 - dl.acm.org
This paper studies communication-oblivious algorithms under the massively parallel
computation (MPC) model. The communication patterns of these algorithms follow a …

Cover or pack: New upper and lower bounds for massively parallel joins

X Hu - Proceedings of the 40th ACM SIGMOD-SIGACT-SIGAI …, 2021 - dl.acm.org
This paper considers the worst-case complexity of multi-round join evaluation in the
Massively Parallel Computation (MPC) model. Unlike the sequential RAM model, in which …

Parallel Acyclic Joins: Optimal Algorithms and Cyclicity Separation

X Hu, Y Tao - Journal of the ACM, 2024 - dl.acm.org
We study equi-join computation in the massively parallel computation (MPC) model.
Currently, a main open question under this topic is whether it is possible to design an …

A near-optimal parallel algorithm for joining binary relations

B Ketsman, D Suciu, Y Tao - Logical Methods in Computer …, 2022 - lmcs.episciences.org
We present a constant-round algorithm in the massively parallel computation (MPC) model
for evaluating a natural join where every input relation has two attributes. Our algorithm …

The complexity of Boolean conjunctive queries with intersection joins

M Abo Khamis, G Chichirim, A Kormpa… - Proceedings of the 41st …, 2022 - dl.acm.org
Intersection joins over interval data are relevant in spatial and temporal data settings. A set
of intervals join if their intersection is non-empty. In case of point intervals, the intersection …

Enabling high-performance large-scale irregular computations

M Besta - 2021 - research-collection.ethz.ch
Computations on irregular graph structures are important for many fields, including social
sciences, bioinformatics, chemistry, medicine, cybersecurity, healthcare, web graph …