Feature selection for high dimensional imbalanced class data using harmony search

A Moayedikia, KL Ong, YL Boo, WGS Yeoh… - … Applications of Artificial …, 2017 - Elsevier
Misclassification costs of minority class data in real-world applications can be very high. This
is a challenging problem especially when the data is also high in dimensionality because of …

Feature selection for high-dimensional and imbalanced biomedical data based on robust correlation based redundancy and binary grasshopper optimization …

G Abdulrauf Sharifai, Z Zainol - Genes, 2020 - mdpi.com
Training a machine learning algorithm on an imbalanced data set is an inherently
challenging task. It becomes more demanding with limited samples but with a massive …

Very large-scale data classification based on K-means clustering and multi-kernel SVM

T Tang, S Chen, M Zhao, W Huang, J Luo - Soft Computing, 2019 - Springer
When classifying very large-scale data sets, there are two major challenges: the first
challenge is that it is time-consuming and laborious to label a sufficient amount of training …

A new approach for instance selection: Algorithms, evaluation, and comparisons

M Malhat, M El Menshawy, H Mousa… - Expert Systems with …, 2020 - Elsevier
Several approaches for instance selection have been put forward as a primary step to
increase the efficiency and accuracy of algorithms applied to mine big data. The instance …

Optimal training and test sets design for machine learning

B Genc, H Tunc - Turkish Journal of Electrical Engineering …, 2019 - journals.tubitak.gov.tr
In this paper, we describe histogram matching, a metric for measuring the distance of two
datasets with exactly the same features, and embed it into a mixed integer programming …

Prediction of lymphedema occurrence in patients with breast cancer using the optimized combination of ensemble learning algorithm and feature selection

A Yaghoobi Notash, A Yaghoobi Notash… - BMC Medical Informatics …, 2022 - Springer
Background Breast cancer-related lymphedema is one of the most important complications
that adversely affect patients' quality of life. Lymphedema can be managed if its risk factors …

Instance selection and feature extraction using cuttlefish optimization algorithm and principal component analysis using decision tree

M Suganthi, V Karunakaran - Cluster Computing, 2019 - Springer
Instance selection and feature extraction are among the most important tasks in data mining,
because huge amounts of data are constantly being produced in many fields. If the dataset is …

A data reduction strategy and its application on scan and backscatter detection using rule-based classifiers

V Herrera-Semenets, OA Pérez-García… - Expert Systems with …, 2018 - Elsevier
In the last few years, the telecommunications scenario has experienced an increase in the
volume of information generated, as well as in the execution of malicious activities. In order …

Data dimensionality reduction (DDR) scheme for intrusion detection system using ensemble and standalone classifiers

A Bansal, S Kaur - Advances in Computing and Data Sciences: Third …, 2019 - Springer
The growth in the IT sector is touching new pinnacles day by day, and hence the number of
devices that are connected through the Internet has increased tremendously, resulting in Big …

Training set selection for monotonic ordinal classification

JR Cano, S García - Data & Knowledge Engineering, 2017 - Elsevier
In recent years, monotonic ordinal classification has drawn increasing attention from the
machine learning community. Real-life problems frequently have monotonicity constraints …