Authors
Rezarta Islamaj Doğan, Robert Leaman, Zhiyong Lu
Publication date
2014/2/1
Journal
Journal of Biomedical Informatics
Volume
47
Pages
1-10
Publisher
Academic Press
Description
Information encoded in natural language in biomedical literature publications is only useful if efficient and reliable ways of accessing and analyzing that information are available. Natural language processing and text mining tools are therefore essential for extracting valuable information. However, the development of powerful, highly effective tools to automatically detect central biomedical concepts such as diseases is conditional on the availability of annotated corpora.
This paper presents the disease name and concept annotations of the NCBI disease corpus, a collection of 793 PubMed abstracts fully annotated at the mention and concept level to serve as a research resource for the biomedical natural language processing community. Each PubMed abstract was manually annotated by two annotators with disease mentions and their corresponding concepts in Medical Subject Headings (MeSH®) or Online …