Manifold learning: What, how, and why

M Meilă, H Zhang - Annual Review of Statistics and Its …, 2024 - annualreviews.org
Manifold learning (ML), also known as nonlinear dimension reduction, is a set of methods to
find the low-dimensional structure of data. Dimension reduction for large, high-dimensional …

Accelerated hierarchical density based clustering

L McInnes, J Healy - 2017 IEEE international conference on …, 2017 - ieeexplore.ieee.org
We present an accelerated algorithm for hierarchical density based clustering. Our new
algorithm improves upon HDBSCAN*, which itself provided a significant qualitative …

[BOOK][B] Frontiers in massive data analysis

National Research Council, Division on Engineering… - 2013 - books.google.com
Data mining of massive data sets is transforming the way we think about crisis response,
marketing, entertainment, cybersecurity and national intelligence. Collections of documents …

Maximum inner-product search using cone trees

P Ram, AG Gray - Proceedings of the 18th ACM SIGKDD international …, 2012 - dl.acm.org
The problem of efficiently finding the best match for a query in a given set with respect to the
Euclidean distance or the cosine similarity has been extensively studied. However, the …

Density estimation trees

P Ram, AG Gray - Proceedings of the 17th ACM SIGKDD international …, 2011 - dl.acm.org
In this paper we develop density estimation trees (DETs), the natural analog of classification
trees and regression trees, for the task of density estimation. We consider the estimation of a …

[BOOK][B] Advances in machine learning and data mining for astronomy

MJ Way, JD Scargle, KM Ali, AN Srivastava - 2012 - api.taylorfrancis.com
Advances in Machine Learning and Data Mining for Astronomy Page 1 Way, Scargle, Chapman
& Hall/CRC Data Mining and Knowledge Discovery Series Advances in Machine Learning …

Fast Euclidean minimum spanning tree: algorithm, analysis, and applications

WB March, P Ram, AG Gray - Proceedings of the 16th ACM SIGKDD …, 2010 - dl.acm.org
The Euclidean Minimum Spanning Tree problem has applications in a wide range of fields,
and many efficient algorithms have been developed to solve it. We present a new, fast …

SpringerBriefs in Computer Science

S Zdonik, P Ning, S Shekhar, J Katz, X Wu, LC Jain… - 2012 - Springer
This is an introduction to multicast routing, which is the study of methods for routing from one
source to many destinations, or from many sources to many destinations. Multicast is …

Conditional t-SNE: more informative t-SNE embeddings

B Kang, D Garcia Garcia, J Lijffijt, R Santos-Rodríguez… - Machine Learning, 2021 - Springer
Dimensionality reduction and manifold learning methods such as t-distributed stochastic
neighbor embedding (t-SNE) are frequently used to map high-dimensional data into a two …

[PDF][PDF] Using the mutual k-nearest neighbor graphs for semi-supervised classification on natural language data

K Ozaki, M Shimbo, M Komachi… - Proceedings of the …, 2011 - aclanthology.org
The first step in graph-based semi-supervised classification is to construct a graph from input
data. While the k-nearest neighbor graphs have been the de facto standard method of graph …