A comprehensive overview of large language models

H Naveed, AU Khan, S Qiu, M Saqib, S Anwar… - arXiv preprint arXiv …, 2023 - arxiv.org
Large Language Models (LLMs) have recently demonstrated remarkable capabilities in
natural language processing tasks and beyond. This success of LLMs has led to a large …

AMMUS: A survey of transformer-based pretrained models in natural language processing

KS Kalyan, A Rajasekharan, S Sangeetha - arXiv preprint arXiv …, 2021 - arxiv.org
Transformer-based pretrained language models (T-PTLMs) have achieved great success in
almost every NLP task. The evolution of these models started with GPT and BERT. These …

NusaCrowd: Open source initiative for Indonesian NLP resources

S Cahyawijaya, H Lovenia, AF Aji… - Findings of the …, 2023 - aclanthology.org
We present NusaCrowd, a collaborative initiative to collect and unify existing resources for
Indonesian languages, including opening access to previously non-public resources …

SemEval-2022 Task 11: Multilingual complex named entity recognition (MultiCoNER)

S Malmasi, A Fang, B Fetahu, S Kar… - Proceedings of the …, 2022 - aclanthology.org
We present the findings of SemEval-2022 Task 11 on Multilingual Complex Named Entity
Recognition (MultiCoNER). Divided into 13 tracks, the task focused on methods to identify …

Language models are few-shot multilingual learners

GI Winata, A Madotto, Z Lin, R Liu, J Yosinski… - arXiv preprint arXiv …, 2021 - arxiv.org
General-purpose language models have demonstrated impressive capabilities, performing
on par with state-of-the-art approaches on a range of downstream natural language …

JGLUE: Japanese general language understanding evaluation

K Kurihara, D Kawahara, T Shibata - Proceedings of the Thirteenth …, 2022 - aclanthology.org
To develop high-performance natural language understanding (NLU) models, it is
necessary to have a benchmark to evaluate and analyze NLU ability from various …

On the effect of pretraining corpora on in-context learning by a large-scale language model

S Shin, SW Lee, H Ahn, S Kim, HS Kim, B Kim… - arXiv preprint arXiv …, 2022 - arxiv.org
Many recent studies on large-scale language models have reported successful in-context
zero- and few-shot learning ability. However, the in-depth analysis of when in-context …

IGLUE: A benchmark for transfer learning across modalities, tasks, and languages

E Bugliarello, F Liu, J Pfeiffer, S Reddy… - International …, 2022 - proceedings.mlr.press
Reliable evaluation benchmarks designed for replicability and comprehensiveness have
driven progress in machine learning. Due to the lack of a multilingual benchmark, however …

End-to-end transformer-based models in textual-based NLP

A Rahali, MA Akhloufi - AI, 2023 - mdpi.com
Transformer architectures are highly expressive because they use self-attention
mechanisms to encode long-range dependencies in the input sequences. In this paper, we …
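As a rough illustration of the self-attention mechanism this abstract refers to (a minimal sketch, not code from the cited paper), a single attention head can be written as a scaled dot-product over query, key, and value projections; the softmax weights are what allow any position to draw on any other, however distant.

    # Minimal sketch of single-head scaled dot-product self-attention (illustrative only).
    import numpy as np

    def self_attention(x, w_q, w_k, w_v):
        """x: (seq_len, d_model); w_q, w_k, w_v: (d_model, d_k) projections."""
        q, k, v = x @ w_q, x @ w_k, x @ w_v            # queries, keys, values
        scores = q @ k.T / np.sqrt(k.shape[-1])        # pairwise similarities, scaled by sqrt(d_k)
        weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
        weights /= weights.sum(axis=-1, keepdims=True) # softmax over key positions
        return weights @ v                             # each output mixes values from all positions

    # Toy usage: 5 tokens, model width 8, head width 4.
    rng = np.random.default_rng(0)
    x = rng.normal(size=(5, 8))
    w_q, w_k, w_v = (rng.normal(size=(8, 4)) for _ in range(3))
    print(self_attention(x, w_q, w_k, w_v).shape)      # (5, 4)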

BLOOM+1: Adding language support to BLOOM for zero-shot prompting

ZX Yong, H Schoelkopf, N Muennighoff, AF Aji… - arXiv preprint arXiv …, 2022 - arxiv.org
The BLOOM model is a large publicly available multilingual language model, but its
pretraining was limited to 46 languages. To extend the benefits of BLOOM to other …