Optimal hashing-based time-space trade-offs for approximate near neighbors

A Andoni, T Laarhoven, I Razenshteyn… - Proceedings of the twenty …, 2017 - SIAM
We show tight upper and lower bounds for time-space trade-offs for the c-approximate Near
Neighbor Search problem. For the d-dimensional Euclidean space and n-point datasets, we …

Index structures for fast similarity search for binary vectors

DA Rachkovskij - Cybernetics and Systems Analysis, 2017 - Springer
This article reviews index structures for fast similarity search for objects represented by
binary vectors (with components equal to 0 or 1). Structures for both exact and approximate …

A two-level signature scheme for stable set similarity joins

D Schmitt, D Kocher, N Augsten, W Mann… - Proceedings of the VLDB …, 2023 - dl.acm.org
We study the set similarity join problem, which retrieves all pairs of similar sets from two
collections of sets for a given distance function. Existing exact solutions employ a signature …

Binary vectors for fast distance and similarity estimation

DA Rachkovskij - Cybernetics and Systems Analysis, 2017 - Springer
This review considers methods and algorithms for fast estimation of distance/similarity
measures between initial data from vector representations with binary or integer-valued …

Optimal Las Vegas locality sensitive data structures

TD Ahle - 2017 IEEE 58th Annual Symposium on Foundations …, 2017 - ieeexplore.ieee.org
We show that approximate similarity (near neighbour) search can be solved in high
dimensions with performance matching state of the art (data independent) Locality Sensitive …

Combi: Compressed binary search tree for approximate k-NN searches in Hamming space

P Gupta, A Jindal, D Sengupta - Big Data Research, 2021 - Elsevier
The space-partitioning based hashing techniques are widely used to represent high-
dimensional data points as bit-codes. Although Binary Search Trees (BSTs) can be used for …

Optimal Las Vegas Approximate Near Neighbors in ℓp

A Wei - Proceedings of the Thirtieth Annual ACM-SIAM …, 2019 - SIAM
We show that approximate near neighbor search in high dimensions can be solved in a Las
Vegas fashion (i.e., without false negatives) for ℓp (1 ≤ p ≤ 2) while matching the …

Massively-parallel similarity join, edge-isoperimetry, and distance correlations on the hypercube

P Beame, C Rashtchian - Proceedings of the Twenty-Eighth Annual ACM …, 2017 - SIAM
We study distributed protocols for finding all pairs of similar vectors in a large dataset. Our
results pertain to a variety of discrete metrics, and we give concrete instantiations for …

Similarity Histogram Estimation Based Top-k Similarity Join Algorithm on High-Dimensional Data

Y Ma, R Zhang, Y Zhang - … Conference on Web Information Systems and …, 2019 - Springer
Top-k similarity join on high-dimensional data plays an important role in many applications.
The traditional tree-like index based approaches can't deal with large scale high …

Fast and Accurate Song Recognition: an Approach Based on Multi-Index Hashing

S Serrano, M Scarpa - 2022 International Conference on …, 2022 - ieeexplore.ieee.org
An activity of wide interest for researchers and companies working in the field of audio signal
processing is the capability to automatically recognize in real-time short excerpts of …