BLOOM: A 176B-parameter open-access multilingual language model

T Le Scao, A Fan, C Akiki, E Pavlick, S Ilić, D Hesslow… - 2023 - inria.hal.science
Large language models (LLMs) have been shown to be able to perform new tasks based on
a few demonstrations or natural language instructions. While these capabilities have led to …

C-Pack: Packaged resources to advance general Chinese embedding

S Xiao, Z Liu, P Zhang, N Muennighoff - arXiv preprint arXiv:2309.07597, 2023 - arxiv.org
We introduce C-Pack, a package of resources that significantly advance the field of general
Chinese embeddings. C-Pack includes three critical resources. 1) C-MTEB is a …

Tabi: An efficient multi-level inference system for large language models

Y Wang, K Chen, H Tan, K Guo - Proceedings of the Eighteenth …, 2023 - dl.acm.org
Today's trend of building ever larger language models (LLMs), while pushing the
performance of natural language processing, adds significant latency to the inference stage …

Alexa teacher model: Pretraining and distilling multi-billion-parameter encoders for natural language understanding systems

J FitzGerald, S Ananthakrishnan, K Arkoudas… - Proceedings of the 28th …, 2022 - dl.acm.org
We present results from a large-scale experiment on pretraining encoders with non-embedding parameter counts ranging from 700M to 9.3B, their subsequent distillation into …

Unsupervised layer-wise score aggregation for textual OOD detection

M Darrin, G Staerman, EDC Gomes… - Proceedings of the …, 2024 - ojs.aaai.org
Out-of-distribution (OOD) detection is a rapidly growing field due to new robustness and security requirements driven by an increased number of AI-based systems. Existing …

Privacy in the time of language models

C Peris, C Dupuy, J Majmudar, R Parikh… - Proceedings of the …, 2023 - dl.acm.org
Pretrained large language models (LLMs) have consistently shown state-of-the-art
performance across multiple natural language processing (NLP) tasks. These models are of …

Multi3WOZ: A Multilingual, Multi-Domain, Multi-Parallel Dataset for Training and Evaluating Culturally Adapted Task-Oriented Dialog Systems

S Hu, H Zhou, M Hergul, M Gritta, G Zhang… - Transactions of the …, 2023 - direct.mit.edu
Creating high-quality annotated data for task-oriented dialog (ToD) is known to be
notoriously difficult, and the challenges are amplified when the goal is to create equitable …

Towards leaving no Indic language behind: Building monolingual corpora, benchmark and models for Indic languages

S Doddapaneni, R Aralikatte, G Ramesh… - arXiv preprint arXiv …, 2022 - arxiv.org
Building Natural Language Understanding (NLU) capabilities for Indic languages, which have a collective speaker base of more than one billion speakers, is absolutely crucial. In this …

Cwcl: Cross-modal transfer with continuously weighted contrastive loss

RS Srinivasa, J Cho, C Yang… - Advances in …, 2023 - proceedings.neurips.cc
This paper considers contrastive training for cross-modal zero-shot transfer, wherein a pre-trained model in one modality is used for representation learning in another domain using …

Multilingual large language model: A survey of resources, taxonomy and frontiers

L Qin, Q Chen, Y Zhou, Z Chen, Y Li, L Liao… - arXiv preprint arXiv …, 2024 - arxiv.org
Multilingual Large Language Models leverage powerful Large Language Models to handle and respond to queries in multiple languages, achieving remarkable …