Linear optimal transport embedding: provable Wasserstein classification for certain rigid transformations and perturbations

C Moosmüller, A Cloninger - … and Inference: A Journal of the …, 2023 - academic.oup.com
Discriminating between distributions is an important problem in a number of scientific fields.
This motivated the introduction of Linear Optimal Transportation (LOT), which embeds the …

Neural tangent kernel maximum mean discrepancy

X Cheng, Y Xie - Advances in Neural Information …, 2021 - proceedings.neurips.cc
We present a novel neural network Maximum Mean Discrepancy (MMD) statistic by
identifying a new connection between neural tangent kernel (NTK) and MMD. This …

Gaussian process landmarking on manifolds

T Gao, SZ Kovalsky, I Daubechies - SIAM Journal on Mathematics of Data …, 2019 - SIAM
As a means of improving analysis of biological shapes, we propose an algorithm for
sampling a Riemannian manifold by sequentially selecting points with maximum uncertainty …

Classification logit two-sample testing by neural networks for differentiating near manifold densities

X Cheng, A Cloninger - IEEE transactions on information theory, 2022 - ieeexplore.ieee.org
The recent success of generative adversarial networks and variational learning suggests
that training a classification network may work well in addressing the classical two-sample …

Supervised learning of sheared distributions using linearized optimal transport

V Khurana, H Kannan, A Cloninger… - Sampling Theory, Signal …, 2023 - Springer
In this paper we study supervised learning tasks on the space of probability measures. We
approach this problem by embedding the space of probability measures into $L^2$ spaces …

Convergence of graph Laplacian with kNN self-tuned kernels

X Cheng, HT Wu - Information and Inference: A Journal of the …, 2022 - academic.oup.com
Kernelized Gram matrix constructed from data points is widely used in graph-based
geometric data analysis and unsupervised learning. An important question is how to choose …

A Review and Taxonomy of Methods for Quantifying Dataset Similarity

M Stolte, A Bommert, J Rahnenführer - arXiv preprint arXiv:2312.04078, 2023 - arxiv.org
In statistics and machine learning, measuring the similarity between two or more datasets is
important for several purposes. The performance of a predictive model on novel datasets …

Linear optimal transport embedding: Provable Wasserstein classification for certain rigid transformations and perturbations

C Moosmüller, A Cloninger - arXiv preprint arXiv:2008.09165, 2020 - arxiv.org
Discriminating between distributions is an important problem in a number of scientific fields.
This motivated the introduction of Linear Optimal Transportation (LOT), which embeds the …

Kernel two-sample tests for manifold data

X Cheng, Y Xie - arXiv preprint arXiv:2105.03425, 2021 - arxiv.org
We present a study of a kernel-based two-sample test statistic related to the Maximum Mean
Discrepancy (MMD) in the manifold data setting, assuming that high-dimensional …

Improved convergence rate of kNN graph Laplacians

Y Tan, X Cheng - arXiv preprint arXiv:2410.23212, 2024 - arxiv.org
In graph-based data analysis, $k$-nearest neighbor ($k$NN) graphs are widely used due
to their adaptivity to local data densities. Allowing weighted edges in the graph, the …