Combining oversampling and undersampling techniques for imbalanced classification: A comparative study using credit card fraudulent transaction dataset

H Shamsudin, UK Yusof, A Jayalakshmi… - 2020 IEEE 16th …, 2020 - ieeexplore.ieee.org
2020 IEEE 16th International Conference on Control & Automation (ICCA), 2020 - ieeexplore.ieee.org
Credit card fraud is a serious and growing problem, and detecting it is made more challenging by highly imbalanced classes. In data mining, prediction, typically known as data classification, involves detecting events. Uncommon events are hard to identify on account of their inconsistency and irregularity; however, misclassifying rare events can result in heavy costs. To overcome this issue, a few researchers have suggested addressing it at the pre-processing stage. One family of available pre-processing methods is sampling. Among sampling methods, oversampling and undersampling are the most widely used techniques for imbalanced data. This paper investigates the performance of a classification model when oversampling and undersampling methods are combined to detect fraud cases in a fraud detection dataset. Several oversampling techniques are selected and combined with random undersampling, and the performance of the resulting model is evaluated using a random forest classifier and performance measures suitable for imbalanced-class problems. The results obtained are then compared with previous literature. From the results, the combination of oversampling and undersampling techniques gives better precision, recall, and F1-measure values, with an average of 0.80%.
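The combined resampling step the abstract describes can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not name the oversampling techniques used, so plain random oversampling stands in for them here. The function names, the `over_ratio` and `final_ratio` parameters, and the toy data are all assumptions made for illustration. The random forest classifier is omitted, and only the precision, recall, and F1 computation is shown.

```python
import random
from collections import Counter


def combine_resample(X, y, minority_label=1, over_ratio=0.5, final_ratio=1.0, seed=0):
    """Randomly oversample the minority class, then randomly undersample the majority.

    over_ratio:  target minority/majority ratio after oversampling (assumed parameter).
    final_ratio: target minority/majority ratio after undersampling (assumed parameter).
    """
    rng = random.Random(seed)
    minority = [i for i, label in enumerate(y) if label == minority_label]
    majority = [i for i, label in enumerate(y) if label != minority_label]
    if not minority or not majority:
        raise ValueError("both classes must be present")

    # Step 1: oversample by duplicating random minority rows (with replacement).
    target_min = max(len(minority), int(over_ratio * len(majority)))
    extra = [rng.choice(minority) for _ in range(target_min - len(minority))]
    minority_idx = minority + extra

    # Step 2: undersample by keeping a random subset of majority rows.
    target_maj = min(len(majority), int(len(minority_idx) / final_ratio))
    majority_idx = rng.sample(majority, target_maj)

    idx = minority_idx + majority_idx
    rng.shuffle(idx)
    return [X[i] for i in idx], [y[i] for i in idx]


def precision_recall_f1(y_true, y_pred, positive=1):
    """Precision, recall, and F1 for the positive (fraud) class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1


# Toy data (hypothetical): 990 legitimate (0) and 10 fraudulent (1) rows.
X = [[float(i)] for i in range(1000)]
y = [0] * 990 + [1] * 10
X_res, y_res = combine_resample(X, y)
counts = Counter(y_res)  # class counts after combined resampling
```

In practice, resampling of this kind is applied only to the training split, so that the test set keeps the original class imbalance and the reported metrics reflect real conditions.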