Authors
Xiaoxing Wu, Hsin-Yao Wang, Peichang Shi, Rong Sun, Xiaolin Wang, Zhixiao Luo, Fanling Zeng, Michael S Lebowitz, Wan-Ying Lin, Jang-Jih Lu, Richard Scherer, Olivia Price, Ziwei Wang, Jiming Zhou, Yonghong Wang
Publication date
2022/5/1
Journal
Computers in Biology and Medicine
Volume
144
Pages
105362
Publisher
Pergamon
Description
Background
Machine learning (ML) has emerged as a superior method for the analysis of large datasets. Application of ML is often hindered by incompleteness of the data, which is particularly evident in disease screening data because testing regimens vary across medical institutions. Here we explored the utility of multiple ML algorithms to predict cancer risk when trained using a large but incomplete real-world dataset of tumor marker (TM) values.
Methods
TM screening data were collected from a large asymptomatic cohort (n = 163,174) at two independent medical centers. The cohort included 785 individuals who were subsequently diagnosed with cancer. Data included levels of up to eight TMs, but for most subjects, only a subset of the biomarkers were tested. In some instances, TM values were available at multiple time points, but intervals between tests varied widely. The data were used to train …