Semantic based features selection and weighting method for text classification

Authors

Aurangzeb Khan, Baharum Baharudin, Khairullah Khan

Publication date

2010/6/15

Conference

2010 international symposium on information technology

Pages

850-855

Publisher

IEEE

Description

Feature selection and weighting are of vital concern in text classification, because they improve the efficiency and accuracy of the text classifier. The Vector Space Model represents documents using the “Bag of Words” (BOW) model with term weighting. Document representation through this model has some limitations: it ignores term dependencies and the structure and ordering of terms in documents. To overcome this problem, a semantics-based feature vector using Part of Speech (POS) is proposed, which extracts the concept of terms using WordNet, co-occurring terms, and associated terms. The proposed method is applied to a small document dataset, and the results show that it outperforms term frequency-inverse document frequency (TF-IDF) with BOW feature selection for text classification.
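For readers unfamiliar with the baseline named in the abstract, the following is a minimal sketch of TF-IDF weighting over a bag-of-words representation. It is not the authors' method or code: the function name `tf_idf`, the whitespace tokenizer, the sample documents, and the unsmoothed `idf = log(N / df)` variant are all assumptions chosen for illustration.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return one {term: weight} dict per document.

    Assumed weighting: raw term frequency times log(N / document frequency).
    """
    # Bag of words: lowercase and split on whitespace, discarding structure.
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)

    # Document frequency: the number of documents that contain each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))

    vectors = []
    for tokens in tokenized:
        tf = Counter(tokens)  # raw count of each term in this document
        vectors.append({t: tf[t] * math.log(n_docs / df[t]) for t in tf})
    return vectors


# Hypothetical sample documents.
docs = ["the cat sat on the mat", "the dog sat on the log", "cats and dogs"]
vectors = tf_idf(docs)

# Word order is lost: two documents with the same terms in a different
# order receive identical vectors.
same = tf_idf(["a b", "c"])[0] == tf_idf(["b a", "c"])[0]
```

The final comparison illustrates the limitation the abstract describes: because the representation only counts terms, it cannot distinguish documents that differ in term order or structure.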

Total citations

Cited by 16

[Chart: citations per year, 2013 to 2023; bar values in extraction order: 2, 2, 2, 1, 2, 1, 1, 1, 2, 1]
