Emergent structures and training dynamics in large language models

J Schneider - Artificial Intelligence Review, 2024 - Springer

Generative AI (GenAI) represents a shift from AI's ability to “recognize” to its ability to
“generate” solutions for a wide range of tasks. As generated solutions and applications grow …

Cited by 15 · Related articles · All 3 versions

Bloom: A 176b-parameter open-access multilingual language model

T Le Scao, A Fan, C Akiki, E Pavlick, S Ilić, D Hesslow… - 2023 - inria.hal.science

Large language models (LLMs) have been shown to be able to perform new tasks based on
a few demonstrations or natural language instructions. While these capabilities have led to …

Cited by 1731 · Related articles · All 16 versions

Emergent abilities of large language models

J Wei, Y Tay, R Bommasani, C Raffel, B Zoph… - arXiv preprint arXiv …, 2022 - arxiv.org

Scaling up language models has been shown to predictably improve performance and
sample efficiency on a wide range of downstream tasks. This paper instead discusses an …

Cited by 2701 · Related articles · All 7 versions

Do Vision and Language Models Share Concepts? A Vector Space Alignment Study

J Li, Y Kementchedjhieva, C Fierro… - Transactions of the …, 2024 - direct.mit.edu

Large-scale pretrained language models (LMs) are said to “lack the ability to connect
utterances to the world” (Bender and Koller, 2020), because they do not have “mental models of …

Cited by 3 · Related articles · All 2 versions

Why Can Large Language Models Generate Correct Chain-of-Thoughts?

R Tutunov, A Grosnit, J Ziomek, J Wang… - arXiv preprint arXiv …, 2023 - arxiv.org

This paper delves into the capabilities of large language models (LLMs), specifically
focusing on advancing the theoretical comprehension of chain-of-thought prompting. We …

Cited by 15 · Related articles · All 2 versions

Subspace chronicles: How linguistic information emerges, shifts and interacts during language model training

M Müller-Eberstein, R Van Der Goot, B Plank… - arXiv preprint arXiv …, 2023 - arxiv.org

Representational spaces learned via language modeling are fundamental to Natural
Language Processing (NLP), however there has been limited understanding regarding how …

Cited by 5 · Related articles · All 7 versions

Implications of the convergence of language and vision model geometries

J Li, Y Kementchedjhieva, A Søgaard - arXiv preprint arXiv:2302.06555, 2023 - arxiv.org

Large-scale pretrained language models (LMs) are said to “lack the ability to connect [their]
utterances to the world” (Bender and Koller, 2020). If so, we would expect LM …

Cited by 6 · Related articles · All 2 versions

Bloom: A 176b-parameter open-access multilingual language model

BS Workshop, TL Scao, A Fan, C Akiki… - arXiv preprint arXiv …, 2022 - arxiv.org

Large language models (LLMs) have been shown to be able to perform new tasks based on
a few demonstrations or natural language instructions. While these capabilities have led to …

Cited by 3 · Related articles · All 7 versions

Questioning Internal Knowledge Structure of Large Language Models Through the Lens of the Olympic Games

J Choi, YB Kim - arXiv preprint arXiv:2409.06518, 2024 - arxiv.org

Large language models (LLMs) have become a dominant approach in natural language
processing, yet their internal knowledge structures remain largely unexplored. In this paper …

A Categorical Framework for Quantifying Emergent Effects in Network Topology

JJ Li, SP Guerra, K Basu, GA Silva - arXiv preprint arXiv:2311.17403, 2023 - silva.ucsd.edu

Emergent effect is crucial to the understanding of the properties of complex systems that do
not appear in their basic units, but there has been a lack of theories to measure and …

Cited by 1 · Related articles · All 2 versions
