Authors
Ammar Ismael Kadhim
Publication date
2019/4/2
Conference
2019 International Conference on Advanced Science and Engineering (ICOASE)
Pages
124-128
Publisher
IEEE
Description
Feature extraction transforms a text document from any format into a list of features that text classification techniques can process easily. It is one of the significant preprocessing techniques in data mining and text classification, and it computes feature values in documents. Hence, efficient feature extraction techniques such as BM25 and term frequency-inverse document frequency (TF-IDF) are commonly used for term weighting. Nevertheless, BM25 is not a single function, and it tends to over-correct for very long documents. This problem obscures the usefulness or importance of certain features and decreases classification efficiency. This paper presents a comparative study of feature extraction techniques. Two techniques, BM25 and TF-IDF, were evaluated for weighting terms in Twitter data. In this paper, the TF-IDF feature extraction technique is presented to compare …
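To make the two term-weighting schemes named in the abstract concrete, here is a minimal sketch in Python. It is not the paper's implementation: the whitespace tokenizer, the function names `tf_idf_weights` and `bm25_weights`, the parameter defaults `k1=1.5` and `b=0.75`, and the particular IDF variants are all assumptions chosen for illustration, and the sample tweets are made up.

```python
import math
from collections import Counter


def tokenize(text):
    # Assumption: lowercase + whitespace split stands in for real tweet preprocessing.
    return text.lower().split()


def _document_frequencies(tokenized):
    # Number of documents in which each term occurs at least once.
    return Counter(term for tokens in tokenized for term in set(tokens))


def tf_idf_weights(docs):
    """Return one {term: weight} dict per document, using (tf / doc_len) * log(N / df)."""
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)
    df = _document_frequencies(tokenized)
    weights = []
    for tokens in tokenized:
        tf = Counter(tokens)
        weights.append(
            {t: (c / len(tokens)) * math.log(n_docs / df[t]) for t, c in tf.items()}
        )
    return weights


def bm25_weights(docs, k1=1.5, b=0.75):
    """Return one {term: weight} dict per document using a common BM25 form.

    k1 controls term-frequency saturation; b controls document-length
    normalization. Both defaults are illustrative assumptions.
    """
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)
    df = _document_frequencies(tokenized)
    # Fall back to 1.0 so an all-empty corpus does not divide by zero.
    avg_len = (sum(len(t) for t in tokenized) / n_docs) or 1.0
    weights = []
    for tokens in tokenized:
        tf = Counter(tokens)
        length_norm = k1 * (1 - b + b * len(tokens) / avg_len)
        doc_weights = {}
        for t, c in tf.items():
            # "+1 inside the log" IDF variant, which keeps weights non-negative.
            idf = math.log(1 + (n_docs - df[t] + 0.5) / (df[t] + 0.5))
            doc_weights[t] = idf * c * (k1 + 1) / (c + length_norm)
        weights.append(doc_weights)
    return weights


if __name__ == "__main__":
    # Hypothetical sample tweets, used only to exercise the functions.
    tweets = ["great phone great battery", "bad phone", "great day"]
    for name, fn in (("TF-IDF", tf_idf_weights), ("BM25", bm25_weights)):
        print(name, fn(tweets)[0])
```

In this sketch, the `b` parameter is the part of BM25 that corrects for document length: with `b=0` no length normalization is applied, and with `b=1` term frequencies are fully scaled by the ratio of document length to average length.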
Total citations
[Chart: citations per year, 2020 to 2024]