Leakage and the reproducibility crisis in machine-learning-based science

S Kapoor, A Narayanan - Patterns, 2023 - cell.com
Machine-learning (ML) methods have gained prominence in the quantitative sciences.
However, there are many known methodological pitfalls, including data leakage, in ML …

REFORMS: Consensus-based Recommendations for Machine-learning-based Science

S Kapoor, EM Cantrell, K Peng, TH Pham, CA Bail… - Science …, 2024 - science.org
Machine learning (ML) methods are proliferating in scientific research. However, the
adoption of these methods has been accompanied by failures of validity, reproducibility, and …

Evaluation of a decided sample size in machine learning applications

D Rajput, WJ Wang, CC Chen - BMC bioinformatics, 2023 - Springer
Background: An appropriate sample size is essential for obtaining a precise and reliable
outcome of a study. In machine learning (ML), studies with inadequate samples suffer from …

Avoiding common machine learning pitfalls

MA Lones - Patterns, 2024 - cell.com
Mistakes in machine learning practice are commonplace and can result in loss of confidence
in the findings and products of machine learning. This tutorial outlines common mistakes that …

How to avoid machine learning pitfalls: a guide for academic researchers

MA Lones - arXiv preprint arXiv:2108.02497, 2021 - arxiv.org
This document is a concise outline of some of the common mistakes that occur when using
machine learning, and what can be done to avoid them. Whilst it should be accessible to …

A heart disease prediction model based on feature optimization and smote-Xgboost algorithm

J Yang, J Guan - Information, 2022 - mdpi.com
In today's world, heart disease is the leading cause of death globally. Researchers have
proposed various methods aimed at improving the accuracy and efficiency of the clinical …

Machine learning models for data-driven prediction of diabetes by lifestyle type

Y Qin, J Wu, W Xiao, K Wang, A Huang, B Liu… - International journal of …, 2022 - mdpi.com
The prevalence of diabetes has been increasing in recent years, and previous research has
found that machine-learning models are good diabetes prediction tools. The purpose of this …

REFORMS: Reporting standards for machine learning based science

S Kapoor, E Cantrell, K Peng, TH Pham, CA Bail… - arXiv preprint arXiv …, 2023 - arxiv.org
Machine learning (ML) methods are proliferating in scientific research. However, the
adoption of these methods has been accompanied by failures of validity, reproducibility, and …

Mental issues, internet addiction and quality of life predict burnout among Hungarian teachers: a machine learning analysis

G Feher, K Kapus, A Tibold, Z Banko, G Berke… - BMC Public Health, 2024 - Springer
Background: Burnout is usually defined as a state of emotional, physical, and mental
exhaustion that affects people in various professions (e.g., physicians, nurses, teachers). The …

NeoAI 1.0: Machine learning-based paradigm for prediction of neonatal and infant risk of death

JS Teji, S Jain, SK Gupta, JS Suri - Computers in Biology and Medicine, 2022 - Elsevier
Background: The neonatal mortality rate in the United States is 3.8 deaths per 1000
live births, which is comparably higher than other nations. Purpose: The aim of the proposed …