Learning from imbalanced data

H He, EA Garcia - IEEE Transactions on knowledge and data …, 2009 - ieeexplore.ieee.org
With the continuous expansion of data availability in many large-scale, complex, and
networked systems, such as surveillance, security, Internet, and finance, it becomes critical …

Data mining for imbalanced datasets: An overview

NV Chawla - Data mining and knowledge discovery handbook, 2010 - Springer
A dataset is imbalanced if the classification categories are not approximately equally
represented. Recent years brought increased interest in applying machine learning …

The impact of class rebalancing techniques on the performance and interpretation of defect prediction models

C Tantithamthavorn, AE Hassan… - IEEE Transactions on …, 2018 - ieeexplore.ieee.org
Defect models that are trained on class imbalanced datasets (i.e., the proportion of defective
and clean modules is not equally represented) are highly susceptible to produce inaccurate …

Generative adversarial network for fault detection diagnosis of chillers

K Yan, A Chong, Y Mo - Building and Environment, 2020 - Elsevier
Automatic fault detection and diagnosis (AFDD) for chillers has significant impacts on energy
saving, indoor environment comfort and systematic building management. Recent works …

Training and assessing classification rules with imbalanced data

G Menardi, N Torelli - Data mining and knowledge discovery, 2014 - Springer
The problem of modeling binary responses by using cross-sectional data has been
addressed with a number of satisfying solutions that draw on both parametric and …

ADASYN: Adaptive synthetic sampling approach for imbalanced learning

H He, Y Bai, EA Garcia, S Li - 2008 IEEE international joint …, 2008 - ieeexplore.ieee.org
This paper presents a novel adaptive synthetic (ADASYN) sampling approach for learning
from imbalanced data sets. The essential idea of ADASYN is to use a weighted distribution …

A study of the behavior of several methods for balancing machine learning training data

GE Batista, RC Prati, MC Monard - ACM SIGKDD explorations newsletter, 2004 - dl.acm.org
There are several aspects that might influence the performance achieved by existing
learning systems. It has been reported that one of these aspects is related to class …

10 m crop type mapping using Sentinel-2 reflectance and 30 m cropland data layer product

KH Tran, HK Zhang, JT McMaine, X Zhang… - International Journal of …, 2022 - Elsevier
The 30 m resolution US Department of Agriculture (USDA) crop data layer (CDL) is a widely
used crop type map for agricultural management and assessment, environmental impact …

Special issue on learning from imbalanced data sets

NV Chawla, N Japkowicz, A Kotcz - ACM SIGKDD explorations …, 2004 - dl.acm.org
The class imbalance problem is one of the (relatively) new problems that emerged when
machine learning matured from an embryonic science to an applied technology, amply used …

Mining with rarity: a unifying framework

GM Weiss - ACM SIGKDD explorations newsletter, 2004 - dl.acm.org
Rare objects are often of great interest and great value. Until recently, however, rarity has
not received much attention in the context of data mining. Now, as increasingly complex real …