A survey on unsupervised outlier detection in high‐dimensional numerical data

A Zimek, E Schubert, HP Kriegel - Statistical Analysis and Data …, 2012 - Wiley Online Library
High‐dimensional data in Euclidean space pose special challenges to data mining
algorithms. These challenges are often indiscriminately subsumed under the term 'curse of …

Searching in high-dimensional spaces: Index structures for improving the performance of multimedia databases

C Böhm, S Berchtold, DA Keim - ACM Computing Surveys (CSUR), 2001 - dl.acm.org
During the last decade, multimedia databases have become increasingly important in many
application areas such as medicine, CAD, geography, and molecular biology. An important …

Clustering high-dimensional data: A survey on subspace clustering, pattern-based clustering, and correlation clustering

HP Kriegel, P Kröger, A Zimek - … on Knowledge Discovery from Data (TKDD …, 2009 - dl.acm.org
As a prolific research area in data mining, subspace clustering and related problems
induced a vast quantity of proposed solutions. However, many publications compare a new …

iDistance: An adaptive B+-tree based indexing method for nearest neighbor search

HV Jagadish, BC Ooi, KL Tan, C Yu… - ACM Transactions on …, 2005 - dl.acm.org
In this article, we present an efficient B+-tree based indexing method, called iDistance, for
K-nearest neighbor (KNN) search in a high-dimensional metric space. iDistance partitions the …

Can shared-neighbor distances defeat the curse of dimensionality?

ME Houle, HP Kriegel, P Kröger, E Schubert… - Scientific and Statistical …, 2010 - Springer
The performance of similarity measures for search, indexing, and data mining applications
tends to degrade rapidly as the dimensionality of the data increases. The effects of the so …

μSuite: A benchmark suite for microservices

A Sriraman, TF Wenisch - 2018 IEEE International Symposium …, 2018 - ieeexplore.ieee.org
Modern On-Line Data Intensive (OLDI) applications have evolved from monolithic systems to
instead comprise numerous, distributed microservices interacting via Remote Procedure …

The concentration of fractional distances

D François, V Wertz, M Verleysen - IEEE Transactions on …, 2007 - ieeexplore.ieee.org
Nearest neighbor search and many other numerical data analysis tools most often rely on
the use of the Euclidean distance. When data are high dimensional, however, the Euclidean …

Indexing the distance: An efficient method to KNN processing

C Yu, BC Ooi, KL Tan, HV Jagadish - VLDB, 2001 - vldb.org
In this paper, we present an efficient method, called iDistance, for K-nearest neighbor (KNN)
search in a high-dimensional space. iDistance partitions the data and selects a reference …

Quality and efficiency in high dimensional nearest neighbor search

Y Tao, K Yi, C Sheng, P Kalnis - Proceedings of the 2009 ACM SIGMOD …, 2009 - dl.acm.org
Nearest neighbor (NN) search in high dimensional space is an important problem in many
applications. Ideally, a practical solution (i) should be implementable in a relational …

On the "dimensionality curse" and the "self-similarity blessing"

F Korn, BU Pagel, C Faloutsos - IEEE Transactions on …, 2001 - ieeexplore.ieee.org
Spatial queries in high-dimensional spaces have been studied extensively. Among them,
nearest neighbor queries are important in many settings, including spatial databases (Find …