Semantically-grounded construction of centroids for datasets with textual attributes

S Martínez, A Valls, D Sánchez - Knowledge-Based Systems, 2012 - Elsevier
Centroids are key components in many data analysis algorithms such as clustering or
microaggregation. They are considered as the central value that minimises the distance to …

Semantics of point spaces through the Topological Weighted Centroid and other mathematical quantities: Theory and applications

M Buscema, M Breda, E Grossi, L Catzola… - Data Mining Applications …, 2012 - Springer
Given a scattering of observations on a map it is natural for one to want to determine the
most likely origin of those points, and the origin is typically hidden within data. Using an …

Pseudo-centroid clustering

F Glover - Soft Computing, 2017 - Springer
Pseudo-centroid clustering replaces the traditional concept of a centroid expressed as a
center of gravity with the notion of a pseudo-centroid (or a coordinate free centroid) which …

Nomclust 2.0: an R package for hierarchical clustering of objects characterized by nominal variables

Z Sulc, J Cibulkova, H Rezankova - Computational Statistics, 2022 - Springer
In this paper, we present the second generation of the nomclust R package, which we
developed for the hierarchical clustering of data containing nominal variables (nominal …

Optimization in symbolic data analysis: dissimilarities, class centers, and clustering

HH Bock - Data Analysis and Decision Support, 2005 - Springer
'Symbolic Data Analysis' (SDA) provides tools for analyzing 'symbolic' data, i.e., data
matrices X = (x_kj) where the entries x_kj are intervals, sets of categories, or frequency …

Coresets for ordered weighted clustering

V Braverman, SHC Jiang… - … on Machine Learning, 2019 - proceedings.mlr.press
We design coresets for Ordered k-Median, a generalization of classical clustering problems
such as k-Median and k-Center. Its objective function is defined via the Ordered Weighted …

Central clustering of categorical data with automated feature weighting

L Chen, S Wang - Twenty-Third International Joint Conference on …, 2013 - Citeseer
The ability to cluster high-dimensional categorical data is essential for many machine
learning applications such as bioinformatics. Currently, central clustering of categorical data …

Evaluating Evaluation Metrics for Ontology-Based Applications: Infinite Reflection

D Maynard, W Peters, Y Li - LREC, 2008 - pages.cs.brandeis.edu
In this paper, we discuss methods of measuring the performance of ontology-based
information extraction systems. We focus particularly on the Balanced Distance Metric …

A toolbox for k-centroids cluster analysis

F Leisch - Computational Statistics & Data Analysis, 2006 - Elsevier
A methodological and computational framework for centroid-based partitioning cluster
analysis using arbitrary distance or similarity measures is presented. The power of high …

A method for k-means-like clustering of categorical data

THT Nguyen, DT Dinh, S Sriboonchitta… - Journal of Ambient …, 2023 - Springer
Despite recent efforts, the challenge in clustering categorical and mixed data in the context
of big data still remains due to the lack of an inherently meaningful measure of similarity …