L Hu, Z Liu, Z Zhao, L Hou, L Nie… - IEEE Transactions on …, 2023 - ieeexplore.ieee.org
Pre-trained Language Models (PLMs), which are trained on large text corpora via self-supervised learning methods, have yielded promising performance on various tasks in …
Information overload is a major obstacle to scientific progress. The explosive growth in scientific literature and data has made it ever harder to discover useful insights in a large …
Language model (LM) pretraining can acquire various kinds of knowledge from text corpora, benefiting downstream tasks. However, existing methods such as BERT model a single document, and …
Pretraining large neural language models, such as BERT, has led to impressive gains on many natural language processing (NLP) tasks. However, most pretraining efforts focus on …
Obtaining large-scale annotated data for NLP tasks in the scientific domain is challenging and expensive. We release SciBERT, a pretrained language model based on BERT (Devlin …
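As a quick illustration of using such a released checkpoint (a minimal sketch, assuming the Hugging Face transformers library; the model name allenai/scibert_scivocab_uncased is the published SciBERT checkpoint, not something stated in the snippet above):

    from transformers import AutoModel, AutoTokenizer

    # Load the released SciBERT checkpoint (uncased, scientific vocabulary).
    tokenizer = AutoTokenizer.from_pretrained("allenai/scibert_scivocab_uncased")
    model = AutoModel.from_pretrained("allenai/scibert_scivocab_uncased")

    # Encode a scientific sentence and inspect the contextual embeddings.
    inputs = tokenizer("The protein binds to the receptor.", return_tensors="pt")
    outputs = model(**inputs)
    print(outputs.last_hidden_state.shape)  # torch.Size([1, num_tokens, 768])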
Motivation: Biomedical text mining is becoming increasingly important as the number of biomedical documents grows rapidly. With the progress in natural language processing …
We introduce S2ORC, a large corpus of 81.1M English-language academic papers spanning many academic disciplines. The corpus consists of rich metadata, paper abstracts …
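S2ORC is distributed as gzipped JSONL shards, so a sketch of scanning paper metadata could look like the following (the shard path and the paper_id / title / abstract field names are assumptions drawn from the corpus documentation, not from the snippet above):

    import gzip
    import json

    # Hypothetical path to one downloaded S2ORC metadata shard.
    SHARD = "s2orc/metadata/metadata_0.jsonl.gz"

    # Each line is one JSON record describing a single paper.
    with gzip.open(SHARD, "rt", encoding="utf-8") as f:
        for line in f:
            paper = json.loads(line)
            # Assumed schema fields: paper_id, title, abstract.
            if paper.get("abstract"):
                print(paper["paper_id"], "-", paper["title"])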
V Yadav, S Bethard - arXiv preprint arXiv:1910.11470, 2019 - arxiv.org
Named Entity Recognition (NER) is a key component in NLP systems for question answering, information retrieval, relation extraction, etc. NER systems have been studied …
Natural language processing (NLP) is useful for handling text and speech, and sequence labeling plays an important role by automatically analyzing a sequence (text) to assign …
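To make the sequence-labeling framing concrete: NER is typically cast as assigning one tag per token under a scheme such as BIO (Begin / Inside / Outside), from which entity spans are then recovered. A small self-contained sketch (the tokens, tags, and entity types are invented for illustration):

    # Sequence labeling assigns one label per token; BIO tags mark entity spans.
    tokens = ["SciBERT", "was", "released", "by", "AllenAI", "."]
    tags = ["B-MODEL", "O", "O", "O", "B-ORG", "O"]

    def extract_entities(tokens, tags):
        """Collect (entity_text, entity_type) spans from BIO-tagged tokens."""
        entities, current, etype = [], [], None
        for token, tag in zip(tokens, tags):
            if tag.startswith("B-"):
                if current:
                    entities.append((" ".join(current), etype))
                current, etype = [token], tag[2:]
            elif tag.startswith("I-") and current:
                current.append(token)
            else:
                if current:
                    entities.append((" ".join(current), etype))
                current, etype = [], None
        if current:
            entities.append((" ".join(current), etype))
        return entities

    print(extract_entities(tokens, tags))  # [('SciBERT', 'MODEL'), ('AllenAI', 'ORG')]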