
A hybrid machine learning approach to screen optimal predictors for the classification of primary breast tumors from gene expression microarray data

Authors

Nashwan Alromema, Asif Hassan Syed, Tabrej Khan

Publication date

2023/2/13

Journal

Diagnostics

Volume

Issue

Pages

708

Publisher

MDPI

Description

The high dimensionality and sparsity of microarray gene expression data make it challenging to analyze the data and to screen an optimal subset of genes as predictors of breast cancer (BC). This study proposes a novel sequential hybrid Feature Selection (FS) framework that combines minimum Redundancy-Maximum Relevance (mRMR), a two-tailed unpaired t-test, and meta-heuristics to screen an optimal set of gene biomarkers as predictors of BC. The framework identified three optimal gene biomarkers: MAPK1, APOBEC3B, and ENAH. In addition, state-of-the-art supervised Machine Learning (ML) algorithms, namely Support Vector Machine (SVM), K-Nearest Neighbors (KNN), Neural Net (NN), Naïve Bayes (NB), Decision Tree (DT), eXtreme Gradient Boosting (XGBoost), and Logistic Regression (LR), were used to test the predictive capability of the selected gene biomarkers and to select the most effective breast cancer diagnostic model based on higher values of the performance metrics. The XGBoost-based model performed best, with an accuracy of 0.976 ± 0.027, an F1-score of 0.974 ± 0.030, and an AUC of 0.961 ± 0.035 when tested on an independent test dataset. The classification system based on the screened gene biomarkers efficiently distinguishes primary breast tumors from normal breast samples.
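To make the t-test filtering stage concrete, the following is a minimal, dependency-free sketch of ranking genes by an unpaired Student's t statistic. It is not the authors' implementation: it omits the mRMR and meta-heuristic stages entirely, runs on synthetic data, and every name in it (`pooled_t_statistic`, `rank_genes`, `INFORMATIVE`, the sample sizes, the effect size) is a hypothetical choice made for illustration. Because every gene shares the same degrees of freedom here, sorting by |t| gives the same order as sorting by the two-tailed p-value.

```python
import random
from statistics import mean, variance


def pooled_t_statistic(group_a, group_b):
    """Unpaired Student's t statistic, assuming equal variances in both groups."""
    n_a, n_b = len(group_a), len(group_b)
    pooled_var = (
        (n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)
    ) / (n_a + n_b - 2)
    standard_error = (pooled_var * (1 / n_a + 1 / n_b)) ** 0.5
    return (mean(group_a) - mean(group_b)) / standard_error


def rank_genes(tumor, normal):
    """Return gene indices sorted by decreasing |t| between the two classes."""
    n_genes = len(tumor[0])
    scores = []
    for gene in range(n_genes):
        t = pooled_t_statistic(
            [sample[gene] for sample in tumor],
            [sample[gene] for sample in normal],
        )
        scores.append((abs(t), gene))
    return [gene for _, gene in sorted(scores, reverse=True)]


# Synthetic expression matrix (assumption): 50 genes, 20 samples per class,
# with three genes shifted upward in the "tumor" class.
rng = random.Random(0)
N_GENES, N_PER_CLASS = 50, 20
INFORMATIVE = {3, 17, 42}


def make_sample(is_tumor):
    """One sample: unit-variance Gaussian noise, shifted for informative genes."""
    return [
        rng.gauss(4.0 if (is_tumor and gene in INFORMATIVE) else 0.0, 1.0)
        for gene in range(N_GENES)
    ]


tumor = [make_sample(True) for _ in range(N_PER_CLASS)]
normal = [make_sample(False) for _ in range(N_PER_CLASS)]

top_three = rank_genes(tumor, normal)[:3]
print("Top-ranked gene indices:", sorted(top_three))
```

In a real pipeline the ranked genes would be passed on to the later selection stages and then to the classifiers. With real microarray data, a library routine that also returns p-values and supports multiple-testing correction would be the more usual choice.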

Total citations

Cited by 12

2023: 5, 2024: 7
