M Aumüller, E Bernhardsson, A Faithfull - International conference on …, 2017 - Springer
This paper describes ANN-Benchmarks, a tool for evaluating the performance of in-memory approximate nearest neighbor algorithms. It provides a standard interface for measuring the …
DA Rachkovskij - Cybernetics and Systems Analysis, 2017 - Springer
This article reviews index structures for fast similarity search for objects represented by binary vectors (with components equal to 0 or 1). Structures for both exact and approximate …
The k-nearest-neighbors (kNN) graph is a popular and powerful data structure that is used in various areas of Data Science, but the high computational cost of obtaining it hinders its …
DA Rachkovskij - Cybernetics and Systems Analysis, 2018 - Springer
This survey paper considers index structures for fast similarity search for objects represented by real-valued vectors. Index structures based on locality-sensitive hashing and their …
Cardinality estimation is perhaps the simplest non-trivial statistical problem that can be solved via sketching. Industrially-deployed sketches like HyperLogLog, MinHash, and PCSA …
S Pettie, D Wang, L Yin - arXiv preprint arXiv:2008.08739, 2020 - researchgate.net
We study sketching schemes for the cardinality estimation problem in data streams, and advocate for measuring the efficiency of such a scheme in terms of its MVP: Memory …
Similarity Search: Algorithms for Sets and other High Dimensional Data. Thomas Dybdahl Ahle. Advisor …