Emergent structures and training dynamics in large language models

R Teehan, M Clinciu, O Serikov, E Szczechla, N Seelam, S Mirkin, A Gokaslan

Proceedings of BigScience Episode #5 -- Workshop on Challenges …, 2022 • aclanthology.org

Abstract

Large language models have achieved success on a number of downstream tasks, particularly in a few and zero-shot manner. As a consequence, researchers have been investigating both the kind of information these networks learn and how such information can be encoded in the parameters of the model. We survey the literature on changes in the network during training, drawing from work outside of NLP when necessary, and on learned representations of linguistic features in large language models. We note in particular the lack of sufficient research on the emergence of functional units, subsections of the network where related functions are grouped or organised, within large language models and motivate future work that grounds the study of language models in an analysis of their changing internal structure during training time.

Cited by: 20 (6 versions)
