
The prevalence of errors in machine learning experiments

Authors

Martin Shepperd, Yuchen Guo, Ning Li, Mahir Arzoky, Andrea Capiluppi, Steve Counsell, Giuseppe Destefanis, Stephen Swift, Allan Tucker, Leila Yousefi

Publication date

2019

Conference

Intelligent Data Engineering and Automated Learning–IDEAL 2019: 20th International Conference, Manchester, UK, November 14–16, 2019, Proceedings, Part I 20

Pages

102-109

Publisher

Springer International Publishing

Description

Context: Conducting experiments is central to machine learning research, where they are used to benchmark, evaluate and compare learning algorithms. Consequently, it is important that we conduct reliable, trustworthy experiments.

Objective: We investigate the incidence of errors in a sample of machine learning experiments in the domain of software defect prediction. Our focus is simple arithmetical and statistical errors.

Method: We analyse 49 papers describing 2456 individual experimental results from a previously undertaken systematic review comparing supervised and unsupervised defect prediction classifiers. We extract the confusion matrices and test for relevant constraints, e.g., the marginal probabilities must sum to one. We also check for multiple statistical significance testing errors.

Results: We find that a total of 22 out of 49 papers contain demonstrable errors. Of these, 7 were statistical …
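
The Method above mentions testing extracted confusion matrices against constraints such as the marginal probabilities summing to one. As a rough illustration of what such a check could look like, here is a minimal Python sketch. It is not the authors' tooling: the function name `check_confusion_matrix`, its parameters, and the tolerance `tol` are all assumptions made for this example, and it assumes the four cells are reported as proportions of all instances.

```python
import math


def check_confusion_matrix(tp, fp, fn, tn,
                           reported_recall=None,
                           reported_precision=None,
                           tol=0.005):
    """Return a list of constraint violations for one reported result.

    Hypothetical helper: cells are assumed to be proportions of all
    instances, and `tol` is an assumed allowance for rounding in print.
    """
    problems = []

    # Each cell is a proportion, so it must lie in [0, 1].
    for name, value in (("tp", tp), ("fp", fp), ("fn", fn), ("tn", tn)):
        if not 0.0 <= value <= 1.0:
            problems.append(f"{name}={value} is outside [0, 1]")

    # The four cells (and hence each pair of marginals) must sum to one.
    total = tp + fp + fn + tn
    if not math.isclose(total, 1.0, abs_tol=tol):
        problems.append(f"cells sum to {total:.3f}, expected 1")

    # A reported recall should match the value recomputed from the cells.
    if reported_recall is not None and tp + fn > 0:
        recall = tp / (tp + fn)
        if abs(recall - reported_recall) > tol:
            problems.append(
                f"reported recall {reported_recall} != recomputed {recall:.3f}")

    # Likewise for a reported precision.
    if reported_precision is not None and tp + fp > 0:
        precision = tp / (tp + fp)
        if abs(precision - reported_precision) > tol:
            problems.append(
                f"reported precision {reported_precision} "
                f"!= recomputed {precision:.3f}")

    return problems


if __name__ == "__main__":
    # Illustrative values only, not data from the paper.
    for issue in check_confusion_matrix(0.10, 0.05, 0.15, 0.80,
                                        reported_recall=0.60):
        print(issue)
```

An empty list means no violation was detected under these assumptions. It does not show that the reported result is correct, only that it is internally consistent.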

Total citations

Cited by 10

[Bar chart: citations per year, 2020–2024]

Scholar articles

The prevalence of errors in machine learning experiments

M Shepperd, Y Guo, N Li, M Arzoky, A Capiluppi… - Intelligent Data Engineering and Automated Learning …, 2019

Cited by 10 · Related articles · All 8 versions