Design and evaluation of a parallel document clustering algorithm based on hierarchical latent semantic analysis

K Seshadri, KV Iyer - Concurrency and Computation: Practice …, 2019 - Wiley Online Library
Concurrency and Computation: Practice and Experience, 2019 - Wiley Online Library
Summary
We propose a parallel generalization scheme for Singular Value Decomposition–based clustering algorithms. The scheme enables such an algorithm to generate a hierarchy of clusters instead of a flat set of clusters. It automatically infers the number of levels in the hierarchy and the number of clusters per level, without depending on any user-supplied parameter. The performance of the resulting hierarchical clustering algorithm was evaluated using the web directory taxonomy hosted by the Open Directory DMOZ. Empirical evaluations and statistical tests reveal that the proposed scheme produces a superior cluster hierarchy compared with two existing generalization techniques in terms of precision, recall, F-measure, and the Rand index. The scheme is well equipped to deal with large datasets, and the speed-up achieved by the parallelized scheme over its sequential variant was measured on a multicore computer.
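
The summary names precision, recall, F-measure, and the Rand index as evaluation measures but does not say how they are computed for a cluster hierarchy. As a rough illustration only, the sketch below uses the common pair-counting definitions for two flat clusterings (for example, one level of a hierarchy compared against a reference taxonomy level). The function name `pair_counting_scores` and the choice of pair counting are assumptions made here, not details taken from the paper.

```python
from itertools import combinations


def pair_counting_scores(predicted, reference):
    """Pair-counting precision, recall, F-measure and Rand index.

    `predicted` and `reference` are equal-length sequences of cluster
    labels, one label per document. This is an illustrative definition,
    not necessarily the one used in the paper.
    """
    if len(predicted) != len(reference):
        raise ValueError("label sequences must have the same length")

    tp = fp = fn = tn = 0
    # Classify every unordered pair of documents.
    for i, j in combinations(range(len(predicted)), 2):
        same_pred = predicted[i] == predicted[j]
        same_ref = reference[i] == reference[j]
        if same_pred and same_ref:
            tp += 1  # grouped together in both clusterings
        elif same_pred:
            fp += 1  # grouped together only in the prediction
        elif same_ref:
            fn += 1  # grouped together only in the reference
        else:
            tn += 1  # separated in both clusterings

    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f_measure = (
        2 * precision * recall / (precision + recall)
        if precision + recall
        else 0.0
    )
    total = tp + fp + fn + tn
    rand_index = (tp + tn) / total if total else 1.0
    return {
        "precision": precision,
        "recall": recall,
        "f_measure": f_measure,
        "rand_index": rand_index,
    }
```

For instance, `pair_counting_scores([0, 0, 1, 1], [0, 0, 0, 1])` classifies the six document pairs as one true positive, one false positive, two false negatives, and two true negatives, giving a precision of 0.5 and a Rand index of 0.5. The loop is quadratic in the number of documents, so for a large collection a contingency-table formulation would be the more practical choice.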