Given a scattering of observations on a map, it is natural to want to determine the most likely origin of those points, and the origin is typically hidden within data. Using an …
Pseudo-centroid clustering replaces the traditional concept of a centroid expressed as a center of gravity with the notion of a pseudo-centroid (or a coordinate-free centroid) which …
Z Sulc, J Cibulkova, H Rezankova - Computational Statistics, 2022 - Springer
In this paper, we present the second generation of the nomclust R package, which we developed for the hierarchical clustering of data containing nominal variables (nominal …
HH Bock - Data Analysis and Decision Support, 2005 - Springer
'Symbolic Data Analysis' (SDA) provides tools for analyzing 'symbolic' data, i.e., data matrices X = (x_kj) where the entries x_kj are intervals, sets of categories, or frequency …
We design coresets for Ordered k-Median, a generalization of classical clustering problems such as k-Median and k-Center. Its objective function is defined via the Ordered Weighted …
L Chen, S Wang - Twenty-Third International Joint Conference on …, 2013 - Citeseer
The ability to cluster high-dimensional categorical data is essential for many machine learning applications such as bioinformatics. Currently, central clustering of categorical data …
D Maynard, W Peters, Y Li - LREC, 2008 - pages.cs.brandeis.edu
In this paper, we discuss methods of measuring the performance of ontology-based information extraction systems. We focus particularly on the Balanced Distance Metric …
F Leisch - Computational statistics & data analysis, 2006 - Elsevier
A methodological and computational framework for centroid-based partitioning cluster analysis using arbitrary distance or similarity measures is presented. The power of high …
Despite recent efforts, the challenge in clustering categorical and mixed data in the context of big data remains due to the lack of an inherently meaningful measure of similarity …