Domain-specific language model pretraining for biomedical natural language processing

Y Gu, R Tinn, H Cheng, M Lucas, N Usuyama… - ACM Transactions on …, 2021 - dl.acm.org
Pretraining large neural language models, such as BERT, has led to impressive gains on
many natural language processing (NLP) tasks. However, most pretraining efforts focus on …

Pretrained language models for biomedical and clinical tasks: understanding and extending the state-of-the-art

P Lewis, M Ott, J Du, V Stoyanov - Proceedings of the 3rd clinical …, 2020 - aclanthology.org
A large array of pretrained models are available to the biomedical NLP (BioNLP) community.
Finding the best model for a particular task can be difficult and time-consuming. For many …

Pretrained biomedical language models for clinical NLP in Spanish

CP Carrino, J Llop, M Pàmies… - Proceedings of the …, 2022 - aclanthology.org
This work presents the first large-scale biomedical Spanish language models trained from
scratch, using large biomedical corpora consisting of a total of 1.1B tokens and an EHR …

Don't stop pretraining: Adapt language models to domains and tasks

S Gururangan, A Marasović, S Swayamdipta… - arXiv preprint arXiv …, 2020 - arxiv.org
Language models pretrained on text from a wide variety of sources form the foundation of
today's NLP. In light of the success of these broad-coverage models, we investigate whether …

Enhancing clinical BERT embedding using a biomedical knowledge base

B Hao, H Zhu, IC Paschalidis - 28th international conference on …, 2020 - par.nsf.gov
Domain knowledge is important for building Natural Language Processing (NLP)
systems for low-resource settings, such as in the clinical domain. In this paper, a novel joint …

BioBART: Pretraining and evaluation of a biomedical generative language model

H Yuan, Z Yuan, R Gan, J Zhang, Y Xie… - arXiv preprint arXiv …, 2022 - arxiv.org
Pretrained language models have served as important backbones for natural language
processing. Recently, in-domain pretraining has been shown to benefit various domain …

Transfer learning in biomedical natural language processing: an evaluation of BERT and ELMo on ten benchmarking datasets

Y Peng, S Yan, Z Lu - arXiv preprint arXiv:1906.05474, 2019 - arxiv.org
Inspired by the success of the General Language Understanding Evaluation benchmark, we
introduce the Biomedical Language Understanding Evaluation (BLUE) benchmark to …

Improving biomedical pretrained language models with knowledge

Z Yuan, Y Liu, C Tan, S Huang, F Huang - arXiv preprint arXiv:2104.10344, 2021 - arxiv.org
Pretrained language models have shown success in many natural language processing
tasks. Many works explore incorporating knowledge into language models. In the …

KEBLM: Knowledge-enhanced biomedical language models

TM Lai, CX Zhai, H Ji - Journal of Biomedical Informatics, 2023 - Elsevier
Pretrained language models (PLMs) have demonstrated strong performance on many
natural language processing (NLP) tasks. Despite their great success, these PLMs are …

Pre-trained language models in biomedical domain: A systematic survey

B Wang, Q Xie, J Pei, Z Chen, P Tiwari, Z Li… - ACM Computing …, 2023 - dl.acm.org
Pre-trained language models (PLMs) have been the de facto paradigm for most natural
language processing tasks. This also benefits the biomedical domain: researchers from …