A systematic review on imbalanced data challenges in machine learning: Applications and solutions

H Kaur, HS Pannu, AK Malhi - ACM Computing Surveys (CSUR), 2019 - dl.acm.org
In machine learning, the data imbalance imposes challenges to perform data analytics in
almost all areas of real-world research. The raw primary data often suffers from the skewed …

Uncertainty in big data analytics: survey, opportunities, and challenges

RH Hariri, EM Fredericks, KM Bowers - Journal of Big Data, 2019 - Springer
Big data analytics has gained wide attention from both academia and industry as the
demand for understanding trends in massive datasets increases. Recent developments in …

A survey on addressing high-class imbalance in big data

JL Leevy, TM Khoshgoftaar, RA Bauder, N Seliya - Journal of Big Data, 2018 - Springer
In a majority–minority classification problem, class imbalance in the dataset(s) can
dramatically skew the performance of classifiers, introducing a prediction bias for the …

Learning from class-imbalanced data: Review of methods and applications

G Haixiang, L Yijing, J Shang, G Mingyun… - Expert systems with …, 2017 - Elsevier
Rare events, especially those that could potentially negatively impact society, often require
humans' decision-making responses. Detecting rare events can be viewed as a prediction …

Under-sampling class imbalanced datasets by combining clustering analysis and instance selection

CF Tsai, WC Lin, YH Hu, GT Yao - Information Sciences, 2019 - Elsevier
Class-imbalanced datasets, i.e., those with the number of data samples in one class being
much larger than that in another class, occur in many real-world problems. Using these …

Neighbourhood-based undersampling approach for handling imbalanced and overlapped data

P Vuttipittayamongkol, E Elyan - Information Sciences, 2020 - Elsevier
Class imbalanced datasets are common across different domains including health, security,
banking and others. A typical supervised learning algorithm tends to be biased towards the …

Survey of review spam detection using machine learning techniques

M Crawford, TM Khoshgoftaar, JD Prusa, AN Richter… - Journal of Big Data, 2015 - Springer
Online reviews are often the primary factor in a customer's decision to purchase a product or
service, and are a valuable source of information that can be used to determine public …

Fuzzy rule based unsupervised sentiment analysis from social media posts

S Vashishtha, S Susan - Expert Systems with Applications, 2019 - Elsevier
In this paper, we compute the sentiment of social media posts using a novel set of fuzzy
rules involving multiple lexicons and datasets. The proposed fuzzy system integrates Natural …

Classification with class imbalance problem

A Ali, SM Shamsuddin, AL Ralescu - Int. J. Advance Soft Compu …, 2013 - researchgate.net
Most existing classification approaches assume the underlying training set is evenly
distributed. In class imbalanced classification, the training set for one class (majority) far …

Towards felicitous decision making: An overview on challenges and trends of Big Data

H Wang, Z Xu, H Fujita, S Liu - Information Sciences, 2016 - Elsevier
The era of Big Data has arrived along with large volume, complex and growing data
generated by many distinct sources. Nowadays, nearly every aspect of the modern society is …