Practical and optimal LSH for angular distance

A Andoni, P Indyk, T Laarhoven… - Advances in neural …, 2015 - proceedings.neurips.cc
We show the existence of a Locality-Sensitive Hashing (LSH) family for the angular distance
that yields an approximate Near Neighbor Search algorithm with the asymptotically optimal …

Hashing for similarity search: A survey

J Wang, HT Shen, J Song, J Ji - arXiv preprint arXiv:1408.2927, 2014 - arxiv.org
Similarity search (nearest neighbor search) is a problem of pursuing the data items whose
distances to a query item are the smallest from a large database. Various methods have …

A review for weighted minhash algorithms

W Wu, B Li, L Chen, J Gao… - IEEE Transactions on …, 2020 - ieeexplore.ieee.org
Data similarity (or distance) computation is a fundamental research topic which underpins
many high-level applications based on similarity measures in machine learning and data …

A probabilistic model for multimodal hash function learning

Y Zhen, DY Yeung - Proceedings of the 18th ACM SIGKDD international …, 2012 - dl.acm.org
In recent years, both hashing-based similarity search and multimodal similarity search have
aroused much research interest in the data mining and other communities. While hashing …

Learning hash codes with listwise supervision

J Wang, W Liu, AX Sun, YG Jiang - Proceedings of the IEEE …, 2013 - cv-foundation.org
Hashing techniques have been intensively investigated in the design of highly efficient
search engines for large-scale computer vision applications. Compared with prior …

Fast locality-sensitive hashing

A Dasgupta, R Kumar, T Sarlós - Proceedings of the 17th ACM SIGKDD …, 2011 - dl.acm.org
Locality-sensitive hashing (LSH) is a basic primitive in several large-scale data processing
applications, including nearest-neighbor search, de-duplication, clustering, etc. In this paper …

The power of comparative reasoning

J Yagnik, D Strelow, DA Ross… - … Conference on Computer …, 2011 - ieeexplore.ieee.org
Rank correlation measures are known for their resilience to perturbations in numeric values
and are widely used in many evaluation metrics. Such ordinal measures have rarely been …

Scalable similarity search with optimized kernel hashing

J He, W Liu, SF Chang - Proceedings of the 16th ACM SIGKDD …, 2010 - dl.acm.org
Scalable similarity search is the core of many large scale learning or data mining
applications. Recently, many research results demonstrate that one promising approach is …

Single cell RNA-seq data clustering using TF-IDF based methods

M Moussa, II Măndoiu - BMC genomics, 2018 - Springer
Background: Single cell transcriptomics is critical for understanding cellular heterogeneity
and identification of novel cell types. Leveraging the recent advances in single cell RNA …

CROification: Accurate kernel classification with the efficiency of sparse linear SVM

M Kafai, K Eshghi - IEEE transactions on pattern analysis and …, 2017 - ieeexplore.ieee.org
Kernel methods have been shown to be effective for many machine learning tasks such as
classification and regression. In particular, support vector machines with the Gaussian …