A survey on deep learning for named entity recognition

J Li, A Sun, J Han, C Li - IEEE transactions on knowledge and …, 2020 - ieeexplore.ieee.org
Named entity recognition (NER) is the task of identifying mentions of rigid designators in text
belonging to predefined semantic types such as person, location, and organization. NER …

Attention in natural language processing

A Galassi, M Lippi, P Torroni - IEEE transactions on neural …, 2020 - ieeexplore.ieee.org
Attention is an increasingly popular mechanism used in a wide range of neural
architectures. The mechanism itself has been realized in a variety of formats. However …

Self-instruct: Aligning language models with self-generated instructions

Y Wang, Y Kordi, S Mishra, A Liu, NA Smith… - arXiv preprint arXiv …, 2022 - arxiv.org
Large "instruction-tuned" language models (i.e., finetuned to respond to instructions) have
demonstrated a remarkable ability to generalize zero-shot to new tasks. Nevertheless, they …

Diffusion-lm improves controllable text generation

X Li, J Thickstun, I Gulrajani… - Advances in Neural …, 2022 - proceedings.neurips.cc
Controlling the behavior of language models (LMs) without re-training is a major open
problem in natural language generation. While recent works have demonstrated successes …

Flaubert: Unsupervised language model pre-training for french

H Le, L Vial, J Frej, V Segonne, M Coavoux… - arXiv preprint arXiv …, 2019 - arxiv.org
Language models have become a key step to achieve state-of-the-art results in many
different Natural Language Processing (NLP) tasks. Leveraging the huge amount of …

Biomedgpt: A unified and generalist biomedical generative pre-trained transformer for vision, language, and multimodal tasks

K Zhang, J Yu, E Adhikarla, R Zhou, Z Yan… - arXiv e …, 2023 - ui.adsabs.harvard.edu
Conventional task- and modality-specific artificial intelligence (AI) models are inflexible in
real-world deployment and maintenance for biomedicine. At the same time, the growing …

Improving medical reasoning through retrieval and self-reflection with retrieval-augmented large language models

M Jeong, J Sohn, M Sung, J Kang - Bioinformatics, 2024 - academic.oup.com
Recent proprietary large language models (LLMs), such as GPT-4, have achieved a
milestone in tackling diverse challenges in the biomedical domain, ranging from multiple …

FEQA: A question answering evaluation framework for faithfulness assessment in abstractive summarization

E Durmus, H He, M Diab - arXiv preprint arXiv:2005.03754, 2020 - arxiv.org
Neural abstractive summarization models are prone to generating content inconsistent with
the source document, i.e., unfaithful. Existing automatic metrics do not capture such mistakes …

Improving language understanding by generative pre-training

A Radford - 2018 - hayate-lab.com
Natural language understanding comprises a wide range of diverse tasks such as textual
entailment, question answering, semantic similarity assessment, and document …

An empirical evaluation of generic convolutional and recurrent networks for sequence modeling

S Bai, JZ Kolter, V Koltun - arXiv preprint arXiv:1803.01271, 2018 - arxiv.org
For most deep learning practitioners, sequence modeling is synonymous with recurrent
networks. Yet recent results indicate that convolutional architectures can outperform …