Using the stability of objects to determine the number of clusters in datasets

E Lord, M Willems, FJ Lapointe, V Makarenkov - Information Sciences, 2017 - Elsevier
Abstract
We introduce a novel method for assessing the robustness of clusters found by partitioning algorithms. First, we show how the stability of individual objects can be estimated based on repeated runs of the K-means and K-medoids algorithms. The quality of the resulting clusterings, expressed by the popular Calinski–Harabasz, Silhouette, Dunn and Davies–Bouldin cluster validity indices, is taken into account when computing the stability estimates of individual objects. Second, we explain how to assess the stability of individual clusters of objects and sets of clusters that are found by partitioning algorithms. Finally, we present a new and effective stability-based algorithm that improves the ability of traditional partitioning methods to determine the number of clusters in datasets. We compare our algorithm to some well-known cluster identification techniques, including X-means, Pvclust, Adegenet, Prediction Strength and Nselectboot. Our experiments with synthetic and benchmark data demonstrate the effectiveness of the proposed algorithm in different practical situations. The R package ClusterStability has been developed to provide applied researchers with new stability estimation tools presented in this paper. It is freely distributed through the Comprehensive R Archive Network (CRAN) and available at: https://cran.r-project.org/web/packages/ClusterStability.
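The idea of estimating the stability of individual objects from repeated partitioning runs can be illustrated with a minimal sketch. It is not the formula from the paper and not the API of the ClusterStability package: it assumes plain K-means with random restarts, omits the weighting by cluster validity indices, and scores each object by how consistently it is grouped with (or kept apart from) every other object across runs. The names `kmeans` and `object_stability` are hypothetical.

```python
import random
from itertools import combinations


def kmeans(points, k, rng, max_iter=50):
    """Plain Lloyd's K-means on a list of coordinate tuples; returns one label per point."""
    centers = rng.sample(points, k)  # random initial centers drawn from the data
    labels = None
    for _ in range(max_iter):
        # Assign each point to its nearest center (squared Euclidean distance).
        new_labels = [
            min(range(k), key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centers[c])))
            for p in points
        ]
        if new_labels == labels:  # assignments stopped changing
            break
        labels = new_labels
        # Move each center to the mean of its members; an empty cluster keeps its old center.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centers[c] = tuple(sum(x) / len(members) for x in zip(*members))
    return labels


def object_stability(points, k, runs=100, seed=0):
    """Illustrative per-object stability score in [0.5, 1] (an assumption, not the paper's index)."""
    rng = random.Random(seed)
    n = len(points)
    together = [[0] * n for _ in range(n)]  # co-clustering counts per pair of objects
    for _ in range(runs):
        labels = kmeans(points, k, rng)
        for i, j in combinations(range(n), 2):
            if labels[i] == labels[j]:
                together[i][j] += 1
                together[j][i] += 1
    scores = []
    for i in range(n):
        total = 0.0
        for j in range(n):
            if j != i:
                freq = together[i][j] / runs
                # A pair is stable if it is almost always together or almost always apart.
                total += max(freq, 1.0 - freq)
        scores.append(total / (n - 1))
    return scores


if __name__ == "__main__":
    # Two small, well-separated groups of toy points (made-up data).
    data = [(0, 0), (0, 1), (1, 0), (1, 1), (10, 10), (10, 11), (11, 10), (11, 11)]
    for point, score in zip(data, object_stability(data, k=2, runs=50)):
        print(point, round(score, 3))
```

A score near 1 means the object's pairwise relationships are almost the same in every run, while a score near 0.5 means they change from run to run. Repeating the computation for several values of `k` and comparing the averaged scores is one simple way to relate such estimates to the choice of the number of clusters, although the paper's own algorithm may differ in its details.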