Improved consistent weighted sampling revisited

D Probst, JL Reymond - Journal of Cheminformatics, 2020 - Springer

The chemical sciences are producing an unprecedented amount of large, high-dimensional
data sets containing chemical structures and associated properties. However, there are …

Cited by 276 - Related articles - All 19 versions

A review for weighted minhash algorithms

W Wu, B Li, L Chen, J Gao… - IEEE Transactions on …, 2020 - ieeexplore.ieee.org

Data similarity (or distance) computation is a fundamental research topic which underpins
many high-level applications based on similarity measures in machine learning and data …

Cited by 45 - Related articles - All 8 versions

Evolution of biosequence search algorithms: a brief survey

G Kucherov - Bioinformatics, 2019 - academic.oup.com

Motivation: Although modern high-throughput biomolecular technologies produce various
types of data, biosequence data remain at the core of bioinformatic analyses. However …

Cited by 31 - Related articles - All 9 versions

Hashing-accelerated graph neural networks for link prediction

W Wu, B Li, C Luo, W Nejdl - Proceedings of the Web Conference 2021, 2021 - dl.acm.org

Networks are ubiquitous in the real world. Link prediction, as one of the key problems for
network-structured data, aims to predict whether there exists a link between two nodes. The …

Cited by 48 - Related articles - All 3 versions

An LSH-based offloading method for IoMT services in integrated cloud-edge environment

X Xu, Q Huang, Y Zhang, S Li, L Qi, W Dou - ACM Transactions on …, 2021 - dl.acm.org

Benefiting from the massive available data provided by Internet of multimedia things (IoMT),
enormous intelligent services requiring information of various types to make decisions are …

Cited by 34 - Related articles

Geo-graph-indistinguishability: Protecting location privacy for LBS over road networks

S Takagi, Y Cao, Y Asano, M Yoshikawa - … , SC, USA, July 15–17, 2019 …, 2019 - Springer

In recent years, Geo-Indistinguishability (GeoI) has been increasingly explored for
protecting location privacy in location-based services (LBSs). GeoI is considered a …

Cited by 42 - Related articles - All 8 versions

The duality of similarity and metric spaces

O Rozinek, J Mareš - Applied Sciences, 2021 - mdpi.com

We introduce a new mathematical basis for similarity space. For the first time, we describe
the relationship between distance and similarity from set theory. Then, we derive generally …

Cited by 23 - Related articles - All 8 versions

Locality sensitive hashing in fourier frequency domain for soft set containment search

I Roy, R Agarwal, S Chakrabarti… - Advances in Neural …, 2023 - proceedings.neurips.cc

In many search applications related to passage retrieval, text entailment, and subgraph
search, the query and each 'document' is a set of elements, with a document being relevant if …

Cited by 2 - Related articles - All 4 versions

Weighted minwise hashing beats linear sketching for inner product estimation

A Bessa, M Daliri, J Freire, C Musco, C Musco… - Proceedings of the …, 2023 - dl.acm.org

We present a new approach for independently computing compact sketches that can be
used to approximate the inner product between pairs of high-dimensional vectors. Based on …

Cited by 6 - Related articles - All 6 versions

Variance reduction in feature hashing using MLE and control variate method

BD Verma, R Pratap, M Thakur - Machine Learning, 2022 - Springer

The feature hashing algorithm introduced by Weinberger et al. is a popular dimensionality
reduction algorithm that compresses high dimensional data points into low dimensional data …

Cited by 4 - Related articles - All 4 versions
