Authors
Liwei Wang, Majid Rastegar-Mojarad, Ravikumar Komandur Elayavilli, Yanshan Wang, Hongfang Liu
Publication date
2018/6/4
Conference
2018 IEEE International Conference on Healthcare Informatics Workshop (ICHI-W)
Pages
1-8
Publisher
IEEE
Description
In the era of precision medicine, the clinical utility of next-generation sequencing technology depends heavily on the ability to interpret causal associations between genetic variants and phenotypes, which can be a labor-intensive process. Various resources, such as HGMD or ClinVar, catalog such associations. Given the exponential growth of the literature in the field, it is desirable to accelerate the process by automatically identifying genetic causality statements in the literature. Here, we define the task of identifying these statements as a classification task for sentences containing gene and disease entities. We used the cancer gene census available at the Catalogue of Somatic Mutations in Cancer (COSMIC) to generate a weakly labeled data set for our classification task. We evaluated multiple feature sets, such as words, bi-grams, and word embeddings, and several machine-learning methods …
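As a rough illustration of the weak-labeling idea described above, the sketch below labels a sentence as positive when its gene and disease mentions appear together in a curated list, then extracts word and bi-gram counts as features. This is a minimal sketch under stated assumptions, not the authors' pipeline: the pair list, the function names `weak_label` and `ngram_features`, and the example sentences are all hypothetical, and whitespace tokenization stands in for whatever preprocessing the paper actually used.

```python
from collections import Counter

# Hypothetical stand-in for a curated gene-disease resource. These pairs are
# illustrative placeholders, not entries taken from the paper's data set.
KNOWN_PAIRS = {("BRCA1", "breast cancer"), ("APC", "colorectal cancer")}


def weak_label(gene, disease, known_pairs=KNOWN_PAIRS):
    """Return 1 if the (gene, disease) pair is in the curated set, else 0."""
    return 1 if (gene, disease.lower()) in known_pairs else 0


def ngram_features(sentence):
    """Count lowercase word unigrams and bi-grams in a sentence."""
    tokens = sentence.lower().split()
    bigrams = [" ".join(pair) for pair in zip(tokens, tokens[1:])]
    return Counter(tokens + bigrams)


# Hypothetical sentences, each paired with the gene and disease it mentions.
examples = [
    ("Mutations in BRCA1 cause hereditary breast cancer", "BRCA1", "breast cancer"),
    ("BRCA1 was measured in colorectal cancer samples", "BRCA1", "colorectal cancer"),
]

# Each item is (feature counts, weak label), ready to feed into a classifier.
dataset = [(ngram_features(s), weak_label(g, d)) for s, g, d in examples]
```

The labels are "weak" because they come from the curated pair list rather than from reading each sentence, so a sentence that merely mentions a known pair is still marked positive.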
Scholar articles
L Wang, M Rastegar-Mojarad, RK Elayavilli, Y Wang… - 2018 IEEE International Conference on Healthcare …, 2018