Authors
S Gowtham, Mausumi Goswami, K Balachandran, Bipul Syam Purkayastha
Publication date
2014/8/27
Conference
2014 Fourth International Conference on Advances in Computing and Communications
Pages
162-166
Publisher
IEEE
Description
Web mining is a cutting-edge technology that covers gathering information from the web and classifying it. This paper presents the concepts of document pre-processing: keywords are extracted from documents fetched from the web and processed to generate a term-document matrix, together with TF-IDF (term frequency-inverse document frequency) weights and several variants of TF-IDF for each document. The final step clusters these results with the K-Means algorithm and compares the performance of each approach. The algorithm is implemented on an x64 architecture in Java and Matlab. The results are tabulated.
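The pipeline described in the abstract (keywords, term-document matrix, TF-IDF weighting, K-Means clustering) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the tokenizer, the TF-IDF variants, or the K-Means initialization, so the whitespace tokenizer, the plain `tf * log(N / df)` weighting, the first-k-points initialization, and all function names below are assumptions made for illustration. Python is used here for brevity, although the paper reports Java and Matlab.

```python
import math
from collections import Counter


def tokenize(text):
    # Assumption: lowercase whitespace tokenization stands in for keyword extraction.
    return text.lower().split()


def term_document_matrix(docs):
    # One row per document, one column per vocabulary term (raw counts).
    tokenized = [tokenize(d) for d in docs]
    vocab = sorted({t for toks in tokenized for t in toks})
    counts = [Counter(toks) for toks in tokenized]
    matrix = [[c[t] for t in vocab] for c in counts]
    return vocab, matrix


def tf_idf(matrix):
    # Assumption: tf = count / document length, idf = ln(N / df).
    n_docs = len(matrix)
    n_terms = len(matrix[0])
    df = [sum(1 for row in matrix if row[j] > 0) for j in range(n_terms)]
    weighted = []
    for row in matrix:
        total = sum(row) or 1  # guard against an empty document
        weighted.append(
            [(row[j] / total) * math.log(n_docs / df[j]) for j in range(n_terms)]
        )
    return weighted


def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def k_means(points, k, iterations=20):
    # Assumption: the first k points serve as initial centroids (deterministic).
    centroids = [list(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point goes to its nearest centroid.
        labels = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        # Update step: each centroid becomes the mean of its members.
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


# Hypothetical toy corpus, not data from the paper.
docs = [
    "cat dog cat",
    "stock market stock",
    "dog cat pet",
    "market stock price",
]
vocab, matrix = term_document_matrix(docs)
weights = tf_idf(matrix)
labels = k_means(weights, k=2)
print(vocab)
print(labels)
```

Swapping `tf_idf` for a different weighting function while keeping the clustering step fixed is one way to compare TF-IDF variants, which is the kind of comparison the abstract describes.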
Scholar articles
S Gowtham, M Goswami, K Balachandran… - 2014 Fourth International Conference on Advances in …, 2014