Text mining with information-theoretic clustering

R Patil, S Boit, V Gudivada, J Nandigam - IEEE Access, 2023 - ieeexplore.ieee.org

Natural Language Processing (NLP) is a research field where a language in consideration
is processed to understand its syntactic, semantic, and sentimental aspects. The …

Cited by 108 · Related articles · All 2 versions

A similarity measure for text classification and clustering

YS Lin, JY Jiang, SJ Lee - IEEE transactions on knowledge and …, 2013 - ieeexplore.ieee.org

Measuring the similarity between documents is an important operation in the text processing
field. In this paper, a new similarity measure is proposed. To compute the similarity between …

Cited by 420 · Related articles · All 8 versions

Comparing and combining dimension reduction techniques for efficient text clustering

B Tang, M Shepherd, E Milios… - Proceeding of SIAM …, 2005 - researchgate.net

A great challenge of text mining arises from the increasingly large text datasets and the high
dimensionality associated with natural language. In this research, a systematic study is …

Cited by 120 · Related articles · All 14 versions

Low-complexity quantization of discrete memoryless channels

JA Zhang, BM Kurkoski - 2016 International Symposium on …, 2016 - ieeexplore.ieee.org

A quantizer design algorithm for discrete memoryless channels with non-binary inputs is
given, when the objective is to maximize the mutual information between the channel input …

Cited by 49 · Related articles · All 2 versions

A niching memetic algorithm for simultaneous clustering and feature selection

W Sheng, X Liu, M Fairhurst - IEEE Transactions on Knowledge …, 2008 - ieeexplore.ieee.org

Clustering is inherently a difficult task, and is made even more difficult when the selection of
relevant features is also an issue. In this paper we propose an approach for simultaneous …

Cited by 86 · Related articles · All 11 versions

Document clustering using character N-grams: a comparative evaluation with term-based and word-based clustering

Y Miao, V Kešelj, E Milios - Proceedings of the 14th ACM international …, 2005 - dl.acm.org

We propose a novel method for document clustering using character N-grams. In the
traditional vector-space model, the documents are represented as vectors, in which each …

Cited by 75 · Related articles · All 12 versions

Combining semantic and term frequency similarities for text clustering

VHA Soares, RJGB Campello… - … and Information Systems, 2019 - Springer

A key challenge for document clustering consists in finding a proper similarity measure for
text documents that enables the generation of cohesive groups. Measures based on the …

Cited by 29 · Related articles · All 7 versions

A statistical model of cluster stability

Z Volkovich, Z Barzily, L Morozensky - Pattern Recognition, 2008 - Elsevier

In the current paper we present a method for assessing cluster stability. This method,
combined with a clustering algorithm, yields an estimate of the data partition, namely, the …

Cited by 67 · Related articles · All 9 versions

The method of N-grams in large-scale clustering of DNA texts

Z Volkovich, V Kirzhner, A Bolshoy, E Nevo, A Korol - Pattern recognition, 2005 - Elsevier

This paper is devoted to the techniques of clustering of texts based on the comparison of
vocabularies of N-grams. In contrast to the regular N-grams approach, the proposed N …

Cited by 49 · Related articles · All 11 versions

A visual approach for interactive keyterm-based clustering

S Nourashrafeddin, E Sherkat, R Minghim… - ACM Transactions on …, 2018 - dl.acm.org

The keyterm-based approach is arguably intuitive for users to direct text-clustering
processes and adapt results to various applications in text analysis. Its way of markedly …

Cited by 19 · Related articles · All 3 versions
