Authors
Ammar Ismael Kadhim, Yu-N Cheah, Inaam Abbas Hieder, Rawaa Ahmed Ali
Publication date
2017
Conference
3rd International Engineering Conference on Developments in Civil & Computer Engineering Applications
Pages
144-152
Description
Feature extraction has gained considerable significance in social networks such as Twitter because it plays a vital role in public opinion analysis, and several algorithms have been proposed for it. Feature extraction is generally defined as the process of extracting interesting, non-trivial features and knowledge from unstructured text documents. It is an interdisciplinary field that draws on information retrieval, machine learning, parameter statistics and computational linguistics. This study implements two methods, term frequency-inverse document frequency (TF-IDF) and logarithm TF-IDF, together with singular value decomposition (SVD) as a dimensionality reduction technique. The paper presents a new method that combines effective preprocessing and dimensionality reduction techniques to support feature extraction with the logarithm TF-IDF method. The experimental results show that the logarithm TF-IDF method enhances the performance of English text document classification and that the proposed algorithm is superior: in general, TF-IDF with logarithm outperforms traditional TF-IDF with respect to the evaluation metrics.
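The difference between the two weighting schemes named in the description can be illustrated with a minimal sketch. The description does not give the paper's exact formula, so the logarithmic variant here is an assumption: it uses the common sublinear form 1 + log(tf) for the term frequency and log(N/df) for the inverse document frequency. The function name `tfidf`, the `log_tf` flag, and the toy documents are all hypothetical, and the SVD step is omitted.

```python
import math
from collections import Counter

def tfidf(docs, log_tf=False):
    """Return one {term: weight} dict per tokenized document.

    Assumed weighting (not taken from the paper):
      plain: tf * log(N / df)
      log:   (1 + log(tf)) * log(N / df)
    """
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for doc in docs for term in set(doc))
    weighted = []
    for doc in docs:
        row = {}
        for term, count in Counter(doc).items():
            tf = 1.0 + math.log(count) if log_tf else float(count)
            idf = math.log(n_docs / df[term])
            row[term] = tf * idf
        weighted.append(row)
    return weighted

# Toy corpus, already tokenized (hypothetical data).
docs = [
    "good good good phone".split(),
    "bad phone".split(),
    "good service".split(),
]
plain = tfidf(docs)
logged = tfidf(docs, log_tf=True)
```

Under this assumed form, a term repeated many times in one document is dampened by the logarithm, while a term that occurs once receives the same weight in both schemes.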
Total citations
Citations per year, 2019 to 2024: 3, 1, 4, 3, 3, 1