Ensembles for feature selection: A review and future trends

V Bolón-Canedo, A Alonso-Betanzos - Information fusion, 2019 - Elsevier
Ensemble learning is a prolific field in Machine Learning since it is based on the assumption
that combining the output of multiple models is better than using a single model, and it …

A comprehensive data level analysis for cancer diagnosis on imbalanced data

S Fotouhi, S Asadi, MW Kattan - Journal of biomedical informatics, 2019 - Elsevier
The early diagnosis of cancer, as one of the major causes of death, is vital for cancerous
patients. Diagnosing diseases in general and cancer in particular is a considerable …

A survey on data collection for machine learning: a big data-ai integration perspective

Y Roh, G Heo, SE Whang - IEEE Transactions on Knowledge …, 2019 - ieeexplore.ieee.org
Data collection is a major bottleneck in machine learning and an active research topic in
multiple communities. There are largely two reasons data collection has recently become a …

Machine learning and deep learning frameworks and libraries for large-scale data mining: a survey

G Nguyen, S Dlugolinsky, M Bobák, V Tran… - Artificial Intelligence …, 2019 - Springer
The combined impact of new computing resources and techniques with an increasing
avalanche of large datasets, is transforming many research areas and may lead to …

A generalized mean distance-based k-nearest neighbor classifier

J Gou, H Ma, W Ou, S Zeng, Y Rao, H Yang - Expert Systems with …, 2019 - Elsevier
K-nearest neighbor (KNN) rule is a well-known non-parametric classifier that is widely used
in pattern recognition. However, the sensitivity of the neighborhood size k always seriously …

An empirical comparison and evaluation of minority oversampling techniques on a large number of imbalanced datasets

G Kovács - Applied Soft Computing, 2019 - Elsevier
Learning and mining from imbalanced datasets gained increased interest in recent years.
One simple but efficient way to increase the performance of standard machine learning …

Geometric SMOTE a geometrically enhanced drop-in replacement for SMOTE

G Douzas, F Bacao - Information sciences, 2019 - Elsevier
Classification of imbalanced datasets is a challenging task for standard algorithms. Although
many methods exist to address this problem in different ways, generating artificial data for …

Feature selection for imbalanced data based on neighborhood rough sets

H Chen, T Li, X Fan, C Luo - Information sciences, 2019 - Elsevier
Feature selection is a meaningful aspect of data mining that aims to select more relevant
data features and provide more concise and explicit data descriptions. It is beneficial for …

A random forests quantile classifier for class imbalanced data

R O'Brien, H Ishwaran - Pattern recognition, 2019 - Elsevier
Extending previous work on quantile classifiers (q-classifiers) we propose the q*-classifier
for the class imbalance problem. The classifier assigns a sample to the minority class if the …

Radial-based oversampling for noisy imbalanced data classification

M Koziarski, B Krawczyk, M Woźniak - Neurocomputing, 2019 - Elsevier
Imbalanced data classification remains a focus of intense research, mostly due to the
prevalence of data imbalance in various real-life application domains. A disproportion …