MASSIVE: A 1M-example multilingual natural language understanding dataset with 51 typologically-diverse languages

J FitzGerald, C Hench, C Peris, S Mackie… - arXiv preprint arXiv …, 2022 - arxiv.org
We present the MASSIVE dataset--Multilingual Amazon SLU resource package (SLURP) for
Slot-filling, Intent classification, and Virtual assistant Evaluation. MASSIVE contains 1M …

MEGAVERSE: Benchmarking large language models across languages, modalities, models and tasks

S Ahuja, D Aggarwal, V Gumma, I Watts… - arXiv preprint arXiv …, 2023 - arxiv.org
Recently, there has been a rapid advancement in research on Large Language Models
(LLMs), resulting in significant progress in several Natural Language Processing (NLP) …

Baichuan 2: Open large-scale language models

A Yang, B Xiao, B Wang, B Zhang, C Bian… - arXiv preprint arXiv …, 2023 - arxiv.org
Large language models (LLMs) have demonstrated remarkable performance on a variety of
natural language tasks based on just a few examples of natural language instructions …

Alexa teacher model: Pretraining and distilling multi-billion-parameter encoders for natural language understanding systems

J FitzGerald, S Ananthakrishnan, K Arkoudas… - Proceedings of the 28th …, 2022 - dl.acm.org
We present results from a large-scale experiment on pretraining encoders with non-
embedding parameter counts ranging from 700M to 9.3B, their subsequent distillation into …

CulturaX: A cleaned, enormous, and multilingual dataset for large language models in 167 languages

T Nguyen, C Van Nguyen, VD Lai, H Man… - arXiv preprint arXiv …, 2023 - arxiv.org
The driving factors behind the development of large language models (LLMs) with
impressive learning capabilities are their colossal model sizes and extensive training …

AlignBench: Benchmarking Chinese alignment of large language models

X Liu, X Lei, S Wang, Y Huang, Z Feng, B Wen… - arXiv preprint arXiv …, 2023 - arxiv.org
Alignment has become a critical step for instruction-tuned Large Language Models (LLMs)
to become helpful assistants. However, effective evaluation of alignment for emerging …

LINGUIST: Language model instruction tuning to generate annotated utterances for intent classification and slot tagging

A Rosenbaum, S Soltan, W Hamza, Y Versley… - arXiv preprint arXiv …, 2022 - arxiv.org
We present LINGUIST, a method for generating annotated data for Intent Classification and
Slot Tagging (IC+ST), via fine-tuning AlexaTM 5B, a 5-billion-parameter multilingual …

FLM-101B: An open LLM and how to train it with a $100K budget

X Li, Y Yao, X Jiang, X Fang, X Meng, S Fan… - arXiv preprint arXiv …, 2023 - arxiv.org
Large language models (LLMs) have achieved remarkable success in NLP and multimodal
tasks. Despite these successes, their development faces two main challenges: (i) high …

API-Bank: A comprehensive benchmark for tool-augmented LLMs

M Li, Y Zhao, B Yu, F Song, H Li, H Yu, Z Li… - arXiv preprint arXiv …, 2023 - arxiv.org
Recent research has demonstrated that Large Language Models (LLMs) can enhance their
capabilities by utilizing external tools. However, three pivotal questions remain …

XNLI: Evaluating cross-lingual sentence representations

A Conneau, G Lample, R Rinott, A Williams… - arXiv preprint arXiv …, 2018 - arxiv.org
State-of-the-art natural language processing systems rely on supervision in the form of
annotated data to learn competent models. These models are generally trained on data in a …