A systematic review on imbalanced data challenges in machine learning: Applications and solutions

H Kaur, HS Pannu, AK Malhi - ACM computing surveys (CSUR), 2019 - dl.acm.org
In machine learning, the data imbalance imposes challenges to perform data analytics in
almost all areas of real-world research. The raw primary data often suffers from the skewed …

A survey on addressing high-class imbalance in big data

JL Leevy, TM Khoshgoftaar, RA Bauder, N Seliya - Journal of Big Data, 2018 - Springer
In a majority–minority classification problem, class imbalance in the dataset(s) can
dramatically skew the performance of classifiers, introducing a prediction bias for the …

Data imbalance in classification: Experimental evaluation

F Thabtah, S Hammoud, F Kamalov, A Gonsalves - Information Sciences, 2020 - Elsevier
The advent of Big Data has ushered a new era of scientific breakthroughs. One of
the common issues that affects raw data is class imbalance problem which refers to …

The impact of class imbalance in classification performance metrics based on the binary confusion matrix

A Luque, A Carrasco, A Martín, A de Las Heras - Pattern Recognition, 2019 - Elsevier
A major issue in the classification of class imbalanced datasets involves the determination of
the most suitable performance metrics to be used. In previous work using several examples …

[Book] Learning from imbalanced data sets

Learning with imbalanced data refers to the scenario in which the amounts of instances that
represent the concepts in a given problem follow a different distribution. The main issue …

SMOTE for learning from imbalanced data: progress and challenges, marking the 15-year anniversary

A Fernández, S Garcia, F Herrera, NV Chawla - Journal of artificial …, 2018 - jair.org
The Synthetic Minority Oversampling Technique (SMOTE) preprocessing algorithm is
considered "de facto" standard in the framework of learning from imbalanced data. This is …

Resampling imbalanced data for network intrusion detection datasets

S Bagui, K Li - Journal of Big Data, 2021 - Springer
Machine learning plays an increasingly significant role in the building of Network
Intrusion Detection Systems. However, machine learning models trained with imbalanced …

On the class overlap problem in imbalanced data classification

P Vuttipittayamongkol, E Elyan, A Petrovski - Knowledge-based systems, 2021 - Elsevier
Class imbalance is an active research area in the machine learning community. However,
existing and recent literature showed that class overlap had a higher negative impact on the …

Data-driven evolutionary optimization: An overview and case studies

Y Jin, H Wang, T Chugh, D Guo… - IEEE Transactions on …, 2018 - ieeexplore.ieee.org
Most evolutionary optimization algorithms assume that the evaluation of the objective and
constraint functions is straightforward. In solving many real-world optimization problems …

Transforming big data into smart data: An insight on the use of the k‐nearest neighbors algorithm to obtain quality data

I Triguero, D García‐Gil, J Maillo… - … : Data Mining and …, 2019 - Wiley Online Library
The k‐nearest neighbors algorithm is characterized as a simple yet effective data mining
technique. The main drawback of this technique appears when massive amounts of data …