Authors
Göksel Biricik, Banu Diri, AC Sönmez
Publication date
2009/7
Conference
DMIN
Pages
481-485
Description
This paper introduces a new algorithm for dimensionality reduction and applies it to web page classification. A heterogeneous collection of web pages is used as the dataset, and the textual content of the pages provides the attributes for classification. Using the proposed algorithm, the high-dimensional attributes (words extracted from the pages) are projected onto a new hyperplane whose number of dimensions equals the number of classes. Results show that the processing times of classification algorithms decrease dramatically with the proposed reduction algorithm, mainly because far fewer attributes are given to the classifiers. The accuracies of the classification algorithms also increase compared to tests run without the proposed reduction algorithm.
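The abstract does not give the projection formula, so the following is only a minimal sketch of one way a bag-of-words representation could be reduced to one feature per class. The weighting scheme (raw per-class term counts), the normalization, and the names `fit_class_term_weights` and `project` are assumptions made for illustration, not the authors' published method, and the toy documents are invented.

```python
from collections import Counter, defaultdict


def fit_class_term_weights(docs, labels):
    """Count how often each term occurs in each class (assumed weighting)."""
    classes = sorted(set(labels))
    class_index = {c: i for i, c in enumerate(classes)}
    weights = defaultdict(lambda: [0.0] * len(classes))
    for tokens, label in zip(docs, labels):
        for term, count in Counter(tokens).items():
            weights[term][class_index[label]] += count
    return classes, dict(weights)


def project(tokens, weights, n_classes):
    """Map one tokenized document to n_classes features that sum to 1."""
    scores = [0.0] * n_classes
    for term, count in Counter(tokens).items():
        # Terms unseen during fitting contribute nothing.
        for k, w in enumerate(weights.get(term, ())):
            scores[k] += count * w
    total = sum(scores)
    return [s / total for s in scores] if total else scores


# Toy corpus: 7 distinct words, 2 classes.
docs = [
    ["goal", "match", "team"],
    ["team", "score"],
    ["stock", "market"],
    ["market", "price", "stock"],
]
labels = ["sports", "sports", "finance", "finance"]

classes, weights = fit_class_term_weights(docs, labels)
reduced = [project(d, weights, len(classes)) for d in docs]
```

Each document is now described by two numbers (one per class) instead of one count per vocabulary word, which is the kind of attribute reduction the abstract credits for the shorter classifier run times. The reduced vectors could then be passed to any standard classifier.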
Total citations
[Chart: citations per year, 2011-2023]