Combining oversampling and undersampling techniques for imbalanced classification: A comparative study using credit card fraudulent transaction dataset

H Shamsudin, UK Yusof, A Jayalakshmi… - 2020 IEEE 16th …, 2020 - ieeexplore.ieee.org
Credit card fraud is a serious and growing problem. It is becoming more challenging with an
issues of highly imbalanced class. In the field of data mining, prediction or typically known as …

Improving healthcare access management by predicting patient no-show behaviour

DB Ferro, S Brailsford, C Bravo, H Smith - Decision Support Systems, 2020 - Elsevier
Low attendance levels in medical appointments have been associated with poor health
outcomes and efficiency problems for service providers. To address this problem, healthcare …

Acoustic and language analysis of speech for suicidal ideation among US veterans

A Belouali, S Gupta, V Sourirajan, J Yu, N Allen… - BioData Mining, 2021 - Springer
Background: Screening for suicidal ideation in high-risk groups such as US veterans is
crucial for early detection and suicide prevention. Currently, screening is based on clinical …

A novel enhanced decision tree model for detecting chronic kidney disease

AK Chaudhuri, D Sinha, DK Banerjee, A Das - Network Modeling Analysis …, 2021 - Springer
Prediction of diseases is sensitive as any error can result in the wrong person's treatment or
not treating the right patient. Besides, some features distinguish a disease from curable to …

An authoritative approach to citation classification

D Pride, P Knoth - Proceedings of the ACM/IEEE Joint Conference on …, 2020 - dl.acm.org
The ability to understand not only that a piece of research has been cited, but why it has
been cited has wide-ranging applications in the areas of research evaluation, in tracking the …

Realistic preterm prediction based on optimized synthetic sampling of EHG signal

J Xu, Z Chen, J Zhang, Y Lu, X Yang, A Pumir - Computers in Biology and …, 2021 - Elsevier
Preterm labor is the leading cause of neonatal morbidity and mortality in newborns and has
attracted significant research attention from many scientific areas. The relationship between …

Resampling imbalanced network intrusion datasets to identify rare attacks

S Bagui, D Mink, S Bagui, S Subramaniam, D Wallace - Future Internet, 2023 - mdpi.com
This study, focusing on identifying rare attacks in imbalanced network intrusion datasets,
explored the effect of using different ratios of oversampled to undersampled data for binary …

Multi-channel electrohysterography enabled uterine contraction characterization and its effect in delivery assessment

J Shen, Y Liu, M Zhang, A Pumir, L Mu, B Li… - Computers in Biology and …, 2023 - Elsevier
Uterine contractions are routinely monitored by tocodynamometer (TOCO) at late stage of
pregnancy to predict the onset of labor. However, TOCO reveals no information on the …

An Improved Random Forest Model for Detecting Heart Disease

AK Chaudhuri, S Das, A Ray - Data-Centric AI Solutions and …, 2024 - taylorfrancis.com
Diagnosing cardiovascular disease (CVD) is a crucial issue in healthcare and research on
machine learning. Machine-learning techniques can predict risk at an early stage of CVD …

Sampling bias due to near-duplicates in learning to rank

M Fröbe, J Bevendorff, JH Reimer, M Potthast… - Proceedings of the 43rd …, 2020 - dl.acm.org
Learning to rank (LTR) is the de facto standard for web search, improving upon classical
retrieval models by exploiting (in)direct relevance feedback from user judgments, interaction …