Authors
Aixin Sun, Ee-Peng Lim, Ying Liu
Publication date
2009/12/31
Journal
Decision Support Systems
Volume
48
Issue
1
Pages
191-201
Publisher
North-Holland
Description
Many real-world text classification tasks involve imbalanced training examples. The strategies proposed to address imbalanced classification (e.g., resampling, instance weighting), however, have not been systematically evaluated in the text domain. In this paper, we conduct a comparative study of the effectiveness of these strategies in the context of imbalanced text classification using a Support Vector Machine (SVM) classifier. We focus on SVM because of the good classification accuracy reported for it in many text classification tasks. We propose a taxonomy that organizes the proposed strategies according to the training and test phases of text classification tasks. Based on the taxonomy, we survey the methods proposed to address imbalanced classification. Among them, 10 commonly used methods were evaluated in our experiments on three benchmark datasets, i.e., Reuters-21578, 20-Newsgroups, and …
Total citations
[Bar chart: citations per year, 2008-2024]