AMMUS: A survey of transformer-based pretrained models in natural language processing

KS Kalyan, A Rajasekharan, S Sangeetha - arXiv preprint arXiv …, 2021 - arxiv.org
Transformer-based pretrained language models (T-PTLMs) have achieved great success in
almost every NLP task. The evolution of these models started with GPT and BERT. These …

AMMU: A survey of transformer-based biomedical pretrained language models

KS Kalyan, A Rajasekharan, S Sangeetha - Journal of Biomedical …, 2022 - Elsevier
Transformer-based pretrained language models (PLMs) have started a new era in modern
natural language processing (NLP). These models combine the power of transformers …

ByT5: Towards a token-free future with pre-trained byte-to-byte models

L Xue, A Barua, N Constant, R Al-Rfou… - Transactions of the …, 2022 - direct.mit.edu
Most widely used pre-trained language models operate on sequences of tokens
corresponding to word or subword units. By comparison, token-free models that operate …
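As a minimal illustration of the byte-level inputs such token-free models consume, the Python sketch below contrasts a subword segmentation with raw UTF-8 byte IDs; the subword split shown is hypothetical, for contrast only, and is not the output of ByT5 or any real tokenizer.

    # Minimal sketch: subword tokens vs. the raw UTF-8 byte IDs a
    # token-free model such as ByT5 would consume. The subword split
    # below is a hypothetical example, not a real tokenizer's output.
    text = "tokenization"
    subwords = ["token", "ization"]        # illustrative subword segmentation
    byte_ids = list(text.encode("utf-8"))  # byte-level inputs, vocab size 256
    print(subwords)   # ['token', 'ization']
    print(byte_ids)   # [116, 111, 107, 101, 110, 105, 122, 97, 116, 105, 111, 110]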

CANINE: Pre-training an Efficient Tokenization-Free Encoder for Language Representation

JH Clark, D Garrette, I Turc, J Wieting - Transactions of the Association …, 2022 - direct.mit.edu
Pipelined NLP systems have largely been superseded by end-to-end neural modeling, yet
nearly all commonly used models still require an explicit tokenization step. While recent …

Charformer: Fast character transformers via gradient-based subword tokenization

Y Tay, VQ Tran, S Ruder, J Gupta, HW Chung… - arXiv preprint arXiv …, 2021 - arxiv.org
State-of-the-art models in natural language processing rely on separate rigid subword
tokenization algorithms, which limit their generalization ability and adaptation to new …

Natural language processing applied to tourism research: A systematic review and future research directions

MÁ Álvarez-Carmona, R Aranda… - Journal of King Saud …, 2022 - Elsevier
Social networks and the rapid development of new technologies have led to
considerable changes in the tourism industry. Artificial intelligence, in particular natural …

CancerBERT: a cancer domain-specific language model for extracting breast cancer phenotypes from electronic health records

S Zhou, N Wang, L Wang, H Liu… - Journal of the American …, 2022 - academic.oup.com
Objective: Accurate extraction of breast cancer patients' phenotypes is important for clinical
decision support and clinical research. This study developed and evaluated cancer domain …

Between words and characters: A brief history of open-vocabulary modeling and tokenization in NLP

SJ Mielke, Z Alyafeai, E Salesky, C Raffel… - arXiv preprint arXiv …, 2021 - arxiv.org
What are the units of text that we want to model? From bytes to multi-word expressions, text
can be analyzed and generated at many granularities. Until recently, most natural language …

A systematic review of transformer-based pre-trained language models through self-supervised learning

E Kotei, R Thirunavukarasu - Information, 2023 - mdpi.com
Transfer learning is a technique used in deep learning applications to transfer learned
knowledge to a different target domain. The approach mainly addresses the problem of a few …

DravidianCodeMix: Sentiment analysis and offensive language identification dataset for Dravidian languages in code-mixed text

BR Chakravarthi, R Priyadharshini… - Language Resources …, 2022 - Springer
This paper describes the development of a multilingual, manually annotated dataset for
three under-resourced Dravidian languages generated from social media comments. The …