The bigscience roots corpus: A 1.6 tb composite multilingual dataset

J Kaddour, J Harris, M Mozes, H Bradley… - arXiv preprint arXiv …, 2023 - arxiv.org

Large Language Models (LLMs) went from non-existent to ubiquitous in the machine
learning discourse within a few years. Due to the fast pace of the field, it is difficult to identify …

被引用次数：244 相关文章所有 3 个版本

[PDF] arxiv.org

A comprehensive overview of large language models

H Naveed, AU Khan, S Qiu, M Saqib, S Anwar… - arXiv preprint arXiv …, 2023 - arxiv.org

Large Language Models (LLMs) have recently demonstrated remarkable capabilities in
natural language processing tasks and beyond. This success of LLMs has led to a large …

被引用次数：207 相关文章所有 3 个版本

[PDF] arxiv.org

A survey of large language models

WX Zhao, K Zhou, J Li, T Tang, X Wang, Y Hou… - arXiv preprint arXiv …, 2023 - arxiv.org

Language is essentially a complex, intricate system of human expressions governed by
grammatical rules. It poses a significant challenge to develop capable AI algorithms for …

被引用次数：1798 相关文章所有 4 个版本

[PDF] mlr.press

The flan collection: Designing data and methods for effective instruction tuning

S Longpre, L Hou, T Vu, A Webson… - International …, 2023 - proceedings.mlr.press

We study the design decision of publicly available instruction tuning methods, by
reproducing and breaking down the development of Flan 2022 (Chung et al., 2022) …

被引用次数：414 相关文章所有 8 个版本

[PDF] neurips.cc

Laion-5b: An open large-scale dataset for training next generation image-text models

C Schuhmann, R Beaumont, R Vencu… - Advances in …, 2022 - proceedings.neurips.cc

Groundbreaking language-vision architectures like CLIP and DALL-E proved the utility of
training on large amounts of noisy image-text data, without relying on expensive accurate …

被引用次数：1879 相关文章所有 12 个版本

[PDF] arxiv.org

Crosslingual generalization through multitask finetuning

N Muennighoff, T Wang, L Sutawika, A Roberts… - arXiv preprint arXiv …, 2022 - arxiv.org

Multitask prompted finetuning (MTF) has been shown to help large language models
generalize to new tasks in a zero-shot setting, but so far explorations of MTF have focused …

被引用次数：451 相关文章所有 5 个版本

[PDF] mlr.press

Large language models struggle to learn long-tail knowledge

N Kandpal, H Deng, A Roberts… - International …, 2023 - proceedings.mlr.press

The Internet contains a wealth of knowledge—from the birthdays of historical figures to
tutorials on how to code—all of which may be learned by language models. However, while …

被引用次数：223 相关文章所有 8 个版本

[PDF] neurips.cc

Scaling data-constrained language models

N Muennighoff, A Rush, B Barak… - Advances in …, 2024 - proceedings.neurips.cc

The current trend of scaling language models involves increasing both parameter count and
training dataset size. Extrapolating this trend suggests that training dataset size may soon be …

被引用次数：111 相关文章所有 7 个版本

[PDF] neurips.cc

Emergent and predictable memorization in large language models

S Biderman, U Prashanth, L Sutawika… - Advances in …, 2024 - proceedings.neurips.cc

Memorization, or the tendency of large language models (LLMs) to output entire sequences
from their training data verbatim, is a key concern for deploying language models. In …

被引用次数：82 相关文章所有 5 个版本

[PDF] aaai.org

Preference ranking optimization for human alignment

F Song, B Yu, M Li, H Yu, F Huang, Y Li… - Proceedings of the AAAI …, 2024 - ojs.aaai.org

Large language models (LLMs) often contain misleading content, emphasizing the need to
align them with human values to ensure secure AI systems. Reinforcement learning from …

被引用次数：100 相关文章所有 4 个版本

高级搜索

QQ 群