Cognitive mirage: A review of hallucinations in large language models

H Ye, T Liu, A Zhang, W Hua, W Jia - arXiv preprint arXiv:2309.06794, 2023 - arxiv.org
As large language models continue to develop in the field of AI, text generation systems are
susceptible to a worrisome phenomenon known as hallucination. In this study, we …

A social path to human-like artificial intelligence

EA Duéñez-Guzmán, S Sadedin, JX Wang… - Nature Machine …, 2023 - nature.com
Traditionally, cognitive and computer scientists have viewed intelligence solipsistically, as a
property of unitary agents devoid of social context. Given the success of contemporary …

A survey of large language models

WX Zhao, K Zhou, J Li, T Tang, X Wang, Y Hou… - arXiv preprint arXiv …, 2023 - arxiv.org
Language is essentially a complex, intricate system of human expressions governed by
grammatical rules. It poses a significant challenge to develop capable AI algorithms for …

A survey on in-context learning

Q Dong, L Li, D Dai, C Zheng, Z Wu, B Chang… - arXiv preprint arXiv …, 2022 - arxiv.org
With the increasing ability of large language models (LLMs), in-context learning (ICL) has
become a new paradigm for natural language processing (NLP), where LLMs make …

Are emergent abilities of large language models a mirage?

R Schaeffer, B Miranda… - Advances in Neural …, 2024 - proceedings.neurips.cc
Recent work claims that large language models display emergent abilities, abilities
not present in smaller-scale models that are present in larger-scale models. What makes …

Transformers learn in-context by gradient descent

J Von Oswald, E Niklasson… - International …, 2023 - proceedings.mlr.press
At present, the mechanisms of in-context learning in Transformers are not well understood
and remain mostly an intuition. In this paper, we suggest that training Transformers on auto …

Emergent analogical reasoning in large language models

T Webb, KJ Holyoak, H Lu - Nature Human Behaviour, 2023 - nature.com
The recent advent of large language models has reinvigorated debate over whether human
cognitive capacities might emerge in such generic models given sufficient training data. Of …

DataComp: In search of the next generation of multimodal datasets

SY Gadre, G Ilharco, A Fang… - Advances in …, 2024 - proceedings.neurips.cc
Multimodal datasets are a critical component in recent breakthroughs such as CLIP, Stable
Diffusion and GPT-4, yet their design does not receive the same research attention as model …

What can transformers learn in-context? A case study of simple function classes

S Garg, D Tsipras, PS Liang… - Advances in Neural …, 2022 - proceedings.neurips.cc
In-context learning is the ability of a model to condition on a prompt sequence consisting of
in-context examples (input-output pairs corresponding to some task) along with a new query …
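
A minimal sketch of the prompt structure described in this snippet: in-context exemplars (input-output pairs from a simple function class) followed by a new query input. The specific function y = 2x + 1 and the prompt formatting are hypothetical choices for illustration, not taken from the paper.

```python
# Sketch of an in-context learning prompt: (input, output) exemplars from a
# simple function class, followed by a query the model must complete.
def build_icl_prompt(examples, query):
    """Format in-context exemplars and a query as a plain-text prompt."""
    lines = [f"Input: {x} -> Output: {y}" for x, y in examples]
    lines.append(f"Input: {query} -> Output:")
    return "\n".join(lines)

if __name__ == "__main__":
    f = lambda x: 2 * x + 1                       # underlying task encoded by the exemplars
    exemplars = [(x, f(x)) for x in range(4)]     # in-context examples
    print(build_icl_prompt(exemplars, query=10))  # model should infer f and answer 21
```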

Larger language models do in-context learning differently

J Wei, J Wei, Y Tay, D Tran, A Webson, Y Lu… - arXiv preprint arXiv …, 2023 - arxiv.org
We study how in-context learning (ICL) in language models is affected by semantic priors
versus input-label mappings. We investigate two setups: ICL with flipped labels and ICL with …
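
The flipped-label setup mentioned in this snippet can be illustrated with a small sketch (not the authors' code): the in-context exemplars carry deliberately inverted sentiment labels, so a model that follows the in-context input-label mapping answers against its semantic priors. The example reviews and label names below are hypothetical.

```python
# Sketch of a flipped-label ICL prompt: exemplar labels are inverted, probing
# whether the model follows the in-context mapping or its semantic priors.
FLIP = {"positive": "negative", "negative": "positive"}

def flipped_label_prompt(examples, query):
    """Build a prompt whose demonstrations use inverted sentiment labels."""
    blocks = [f"Review: {text}\nSentiment: {FLIP[label]}" for text, label in examples]
    blocks.append(f"Review: {query}\nSentiment:")
    return "\n\n".join(blocks)

demos = [
    ("I loved every minute of this film.", "positive"),
    ("A dull, lifeless bore.", "negative"),
]
# A model that follows the flipped mapping should answer "negative" here.
print(flipped_label_prompt(demos, "An absolute masterpiece."))
```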