Density‐based clustering

RJGB Campello, P Kröger, J Sander… - … Reviews: Data Mining …, 2020 - Wiley Online Library
Clustering refers to the task of identifying groups or clusters in a data set. In density‐based
clustering, a cluster is a set of data objects spread in the data space over a contiguous …

Image classification with deep learning in the presence of noisy labels: A survey

G Algan, I Ulusoy - Knowledge-Based Systems, 2021 - Elsevier
Image classification systems recently made a giant leap with the advancement of deep
neural networks. However, these systems require an excessive amount of labeled data to be …

Dimensionality-driven learning with noisy labels

X Ma, Y Wang, ME Houle, S Zhou… - International …, 2018 - proceedings.mlr.press
Datasets with significant proportions of noisy (incorrect) class labels present challenges for
training accurate Deep Neural Networks (DNNs). We propose a new perspective for …

FedCorr: Multi-stage federated learning for label noise correction

J Xu, Z Chen, TQS Quek… - Proceedings of the IEEE …, 2022 - openaccess.thecvf.com
Federated learning (FL) is a privacy-preserving distributed learning paradigm that enables
clients to jointly train a global model. In real-world FL implementations, client data could …

ANN-Benchmarks: A benchmarking tool for approximate nearest neighbor algorithms

M Aumüller, E Bernhardsson, A Faithfull - Information Systems, 2020 - Elsevier
This paper describes ANN-Benchmarks, a tool for evaluating the performance of in-memory
approximate nearest neighbor algorithms. It provides a standard interface for measuring the …

Isotropy in the contextual embedding space: Clusters and manifolds

X Cai, J Huang, Y Bian, K Church - International conference on …, 2021 - openreview.net
The geometric properties of contextual embedding spaces for deep language models such
as BERT and ERNIE, have attracted considerable attention in recent years. Investigations on …

Indexing metric spaces for exact similarity search

L Chen, Y Gao, X Song, Z Li, Y Zhu, X Miao… - ACM Computing …, 2022 - dl.acm.org
With the continued digitization of societal processes, we are seeing an explosion in
available data. This is referred to as big data. In a research setting, three aspects of the data …

Estimating local intrinsic dimensionality

L Amsaleg, O Chelly, T Furon, S Girard… - Proceedings of the 21th …, 2015 - dl.acm.org
This paper is concerned with the estimation of a local measure of intrinsic dimensionality
(ID) recently proposed by Houle. The local model can be regarded as an extension of …

Local intrinsic dimensionality I: an extreme-value-theoretic foundation for similarity applications

ME Houle - Similarity Search and Applications: 10th International …, 2017 - Springer
Researchers have long considered the analysis of similarity applications in terms of the
intrinsic dimensionality (ID) of the data. This theory paper is concerned with a generalization …

Uniform convergence rates for kernel density estimation

H Jiang - International Conference on Machine Learning, 2017 - proceedings.mlr.press
Kernel density estimation (KDE) is a popular nonparametric density estimation method. We
(1) derive finite-sample high-probability density estimation bounds for multivariate KDE …