I/O-efficient similarity join

X Yuan, X Wang, C Wang, C Yu… - IEEE Transactions on …, 2017 - ieeexplore.ieee.org

Similarity search on high-dimensional data has been intensively studied for data processing
and analytics. Despite its broad applicability, data security and privacy concerns along the …

Cited by 42 · Related articles · All 6 versions

[PDF] arxiv.org

Scalable and robust set similarity join

T Christiani, R Pagh, J Sivertsen - 2018 IEEE 34th international …, 2018 - ieeexplore.ieee.org

Set similarity join is a fundamental and well-studied database operator. It is usually studied
in the exact setting where the goal is to compute all pairs of sets that exceed a given …

Cited by 39 · Related articles · All 10 versions

[PDF] arxiv.org

On the complexity of inner product similarity join

TD Ahle, R Pagh, I Razenshteyn… - Proceedings of the 35th …, 2016 - dl.acm.org

A number of tasks in classification, information retrieval, recommendation systems, and
record linkage reduce to the core problem of inner product similarity join (IPS join) …

Cited by 48 · Related articles · All 9 versions

[PDF] uwaterloo.ca

Output-optimal massively parallel algorithms for similarity joins

X Hu, K Yi, Y Tao - ACM Transactions on Database Systems (TODS), 2019 - dl.acm.org

Parallel join algorithms have received much attention in recent years due to the rapid
development of massively parallel systems such as MapReduce and Spark. In the database …

Cited by 30 · Related articles · All 7 versions

[PDF] googleapis.com

Systems and methods for privacy-assured similarity joins over encrypted datasets

C Wang, S Nutanong, X Yuan, X Wang… - US Patent 10,496,638, 2019 - Google Patents

Abstract Systems and methods which provide secure queries with respect to encrypted
datasets are described. Embodiments provide privacy-assured similarity join techniques …

Cited by 31 · Related articles · All 4 versions

[PDF] hkust.edu.hk

Output-optimal parallel algorithms for similarity joins

X Hu, Y Tao, K Yi - Proceedings of the 36th ACM SIGMOD-SIGACT …, 2017 - dl.acm.org

Parallel join algorithms have received much attention in recent years, due to the rapid
development of massively parallel systems such as MapReduce and Spark. In the database …

Cited by 26 · Related articles · All 10 versions

[PDF] researchgate.net

I/O-efficient similarity join

R Pagh, N Pham, F Silvestri, M Stöckel - Algorithmica, 2017 - Springer

We present an I/O-efficient algorithm for computing similarity joins based on locality-
sensitive hashing (LSH). In contrast to the filtering methods commonly suggested our …

Cited by 16 · Related articles · All 6 versions

[PDF] arxiv.org

Adaptive mapreduce similarity joins

S McCauley, F Silvestri - Proceedings of the 5th ACM SIGMOD …, 2018 - dl.acm.org

Similarity joins are a fundamental database operation. Given data sets S and R, the goal of a
similarity join is to find all points x ∈ S and y ∈ R with distance at most r. Recent research …

Cited by 12 · Related articles · All 10 versions

[PDF] arxiv.org

Set similarity search for skewed data

S McCauley, JW Mikkelsen, R Pagh - … of the 37th ACM SIGMOD-SIGACT …, 2018 - dl.acm.org

Set similarity join, as well as the corresponding indexing problem set similarity search, are
fundamental primitives for managing noisy or uncertain data. For example, these primitives …

Cited by 9 · Related articles · All 6 versions

[PDF] siam.org

Massively-parallel similarity join, edge-isoperimetry, and distance correlations on the hypercube

P Beame, C Rashtchian - Proceedings of the Twenty-Eighth Annual ACM …, 2017 - SIAM

We study distributed protocols for finding all pairs of similar vectors in a large dataset. Our
results pertain to a variety of discrete metrics, and we give concrete instantiations for …

Cited by 7 · Related articles · All 6 versions
