The properties of high-dimensional data spaces: implications for exploring gene and protein expression data

R Clarke, HW Ressom, A Wang, J Xuan, MC Liu… - Nature reviews …, 2008 - nature.com
High-throughput genomic and proteomic technologies are widely used in cancer research to
build better predictive models of diagnosis, prognosis and therapy, to identify and …

The lernaean hydra of data series similarity search: An experimental evaluation of the state of the art

K Echihabi, K Zoumpatianos, T Palpanas… - arXiv preprint arXiv …, 2020 - arxiv.org
Increasingly large data series collections are becoming commonplace across many different
domains and applications. A key operation in the analysis of data series collections is …

Zero-shot learning with semantic output codes

M Palatucci, D Pomerleau… - Advances in neural …, 2009 - proceedings.neurips.cc
We consider the problem of zero-shot learning, where the goal is to learn a classifier
$f: X \rightarrow Y$ that must predict novel values of $Y$ that were omitted from the training …

[BOOK][B] Similarity search: the metric space approach

P Zezula, G Amato, V Dohnal, M Batko - 2006 - books.google.com
The area of similarity searching is a very hot topic for both research and commercial
applications. Current data processing applications use data with considerably less structure …

The concentration of fractional distances

D François, V Wertz, M Verleysen - IEEE Transactions on …, 2007 - ieeexplore.ieee.org
Nearest neighbor search and many other numerical data analysis tools most often rely on
the use of the Euclidean distance. When data are high dimensional, however, the Euclidean …

A compact and efficient image retrieval approach based on border/interior pixel classification

RO Stehling, MA Nascimento, AX Falcão - Proceedings of the eleventh …, 2002 - dl.acm.org
This paper presents BIC (Border/Interior pixel Classification), a compact and efficient CBIR
approach suitable for broad image domains. It has three main components: (1) a simple and …

Quality and efficiency in high dimensional nearest neighbor search

Y Tao, K Yi, C Sheng, P Kalnis - Proceedings of the 2009 ACM SIGMOD …, 2009 - dl.acm.org
Nearest neighbor (NN) search in high dimensional space is an important problem in many
applications. Ideally, a practical solution (i) should be implementable in a relational …

Content-based copy retrieval using distortion-based probabilistic similarity search

A Joly, O Buisson, C Frelicot - IEEE Transactions on Multimedia, 2007 - ieeexplore.ieee.org
Content-based copy retrieval (CBCR) aims at retrieving in a database all the modified
versions or the previous versions of a given candidate object. In this paper, we present a …

Maximum inner-product search using cone trees

P Ram, AG Gray - Proceedings of the 18th ACM SIGKDD international …, 2012 - dl.acm.org
The problem of efficiently finding the best match for a query in a given set with respect to the
Euclidean distance or the cosine similarity has been extensively studied. However, the …

Return of the lernaean hydra: Experimental evaluation of data series approximate similarity search

K Echihabi, K Zoumpatianos, T Palpanas… - arXiv preprint arXiv …, 2020 - arxiv.org
Data series are a special type of multidimensional data present in numerous domains,
where similarity search is a key operation that has been extensively studied in the data …