Intrinsic dimension estimation: Relevant techniques and a benchmark framework

P Campadelli, E Casiraghi, C Ceruti… - Mathematical Problems …, 2015 - Wiley Online Library
When dealing with datasets comprising high‐dimensional points, it is usually advantageous
to discover some data structure. A fundamental piece of information needed for this aim is the …

Scikit-dimension: a Python package for intrinsic dimension estimation

J Bac, EM Mirkes, AN Gorban, I Tyukin, A Zinovyev - Entropy, 2021 - mdpi.com
Dealing with uncertainty in applications of machine learning to real-life data critically
depends on the knowledge of intrinsic dimensionality (ID). A number of methods have been …

Adaptive approximation and generalization of deep neural network with intrinsic dimensionality

R Nakada, M Imaizumi - Journal of Machine Learning Research, 2020 - jmlr.org
In this study, we prove that an intrinsic low dimensionality of covariates is the main factor that
determines the performance of deep neural networks (DNNs). DNNs generally provide …

Sparse manifold clustering and embedding

E Elhamifar, R Vidal - Advances in neural information …, 2011 - proceedings.neurips.cc
We propose an algorithm called Sparse Manifold Clustering and Embedding (SMCE) for
simultaneous clustering and dimensionality reduction of data lying in multiple nonlinear …

Your diffusion model secretly knows the dimension of the data manifold

J Stanczuk, G Batzolis, T Deveney… - arXiv preprint arXiv …, 2022 - arxiv.org
In this work, we propose a novel framework for estimating the dimension of the data manifold
using a trained diffusion model. A diffusion model approximates the score function, i.e., the …

Pattern recognition in Latin America in the “Big Data” era

A Fernández, Á Gómez, F Lecumberry, Á Pardo… - Pattern Recognition, 2015 - Elsevier
The “Big Data” era has arisen, driven by the increasing availability of data from
multiple sources such as social media, online transactions, network sensors or mobile …

Multi-manifold semi-supervised learning

A Goldberg, X Zhu, A Singh, Z Xu… - Artificial intelligence …, 2009 - proceedings.mlr.press
We study semi-supervised learning when the data consists of multiple intersecting
manifolds. We give a finite sample analysis to quantify the potential gain of using unlabeled …

Spectral clustering on multiple manifolds

Y Wang, Y Jiang, Y Wu, ZH Zhou - IEEE Transactions on …, 2011 - ieeexplore.ieee.org
Spectral clustering (SC) is a large family of grouping methods that partition data using
eigenvectors of an affinity matrix derived from the data. Though SC methods have been …

Data skeletonization via Reeb graphs

X Ge, I Safa, M Belkin, Y Wang - Advances in neural …, 2011 - proceedings.neurips.cc
Recovering hidden structure from complex and noisy non-linear data is one of the most
fundamental problems in machine learning and statistical inference. While such data is often …

A survey of manifold learning for images

R Pless, R Souvenir - IPSJ Transactions on Computer Vision and …, 2009 - jstage.jst.go.jp
Many natural image sets are samples of a low-dimensional manifold in the space of all
possible images. Understanding this manifold is a key first step in understanding many sets …