The choice of scaling technique matters for classification performance

LBV de Amorim, GDC Cavalcanti, RMO Cruz - Applied Soft Computing, 2023 - Elsevier
Dataset scaling, also known as normalization, is an essential preprocessing step in a
machine learning pipeline. It is aimed at adjusting attribute scales in a way that they all vary …

An efficient data partitioning to improve classification performance while keeping parameters interpretable

K Korjus, MN Hebart, R Vicente - PLoS ONE, 2016 - journals.plos.org
Supervised machine learning methods typically require splitting data into multiple chunks for
training, validating, and finally testing classifiers. For finding the best parameters of a …

The effects of class imbalance and training data size on classifier learning: an empirical study

W Zheng, M Jin - SN Computer Science, 2020 - Springer
This study discusses the effects of class imbalance and training data size on the predictive
performance of classifiers. An empirical study was performed on ten classifiers arising from …

Input decimation ensembles: Decorrelation through dimensionality reduction

NC Oza, K Tumer - International Workshop on Multiple Classifier Systems, 2001 - Springer
Using an ensemble of classifiers instead of a single classifier has been shown to improve
generalization performance in many machine learning problems [4, 16]. However, the extent …

Hyper-parameter optimization in classification: To-do or not-to-do

N Tran, JG Schneider, I Weber, AK Qin - Pattern Recognition, 2020 - Elsevier
Hyper-parameter optimization is a process to find suitable hyper-parameters for predictive
models. It typically incurs highly demanding computational costs due to the need of the time …

The effect of class distribution on classifier learning: an empirical study

GM Weiss, F Provost - 2001 - storm.cis.fordham.edu
In this article we analyze the effect of class distribution on classifier learning. We begin by
describing the different ways in which class distribution affects learning and how it affects the …

On model evaluation under non-constant class imbalance

J Brabec, T Komárek, V Franc, L Machlica - Computational Science–ICCS …, 2020 - Springer
Many real-world classification problems are significantly class-imbalanced to the detriment of
the class of interest. The standard set of proper evaluation metrics is well-known but the …

An experimental comparison of performance measures for classification

C Ferri, J Hernández-Orallo, R Modroiu - Pattern Recognition Letters, 2009 - Elsevier
Performance metrics in classification are fundamental in assessing the quality of learning
methods and learned models. However, many different measures have been defined in the …

Feature subset selection bias for classification learning

SK Singhi, H Liu - Proceedings of the 23rd international conference on …, 2006 - dl.acm.org
Feature selection is often applied to high-dimensional data prior to classification learning.
Using the same training dataset in both selection and learning can result in so-called feature …

Learning to classify with incremental new class

DW Zhou, Y Yang, DC Zhan - IEEE Transactions on Neural …, 2021 - ieeexplore.ieee.org
New class detection and effective model expansion are of great importance in incremental
data mining. In open incremental data environments, data often come with novel classes, e.g. …