Language models are few-shot learners - Academic Resource Search

Language models are few-shot learners

T Brown, B Mann, N Ryder… - Advances in neural …, 2020 - proceedings.neurips.cc

We demonstrate that scaling up language models greatly improves task-agnostic, few-shot
performance, sometimes even becoming competitive with prior state-of-the-art fine-tuning
approaches. Specifically, we train GPT-3, an autoregressive language model with 175
billion parameters, 10x more than any previous non-sparse language model, and test its
performance in the few-shot setting. For all tasks, GPT-3 is applied without any gradient
updates or fine-tuning, with tasks and few-shot demonstrations specified purely via text …
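
To make the idea of specifying a task and its demonstrations "purely via text" concrete, here is a minimal sketch of how such a prompt might be assembled. The function name `build_few_shot_prompt`, the `Input:`/`Output:` layout, and the translation pairs are illustrative assumptions, not the format used by the authors. No model is called and no parameters are updated; the only thing that changes between tasks is the text.

```python
def build_few_shot_prompt(task_description, demonstrations, query):
    """Assemble a text-only few-shot prompt.

    Hypothetical helper: the "Input:/Output:" layout is an assumption
    made for illustration, not a format taken from the paper.
    """
    lines = [task_description, ""]
    # Each demonstration is a solved (input, output) pair shown as plain text.
    for source, target in demonstrations:
        lines.append(f"Input: {source}")
        lines.append(f"Output: {target}")
        lines.append("")
    # The final query is left unanswered for the model to complete.
    lines.append(f"Input: {query}")
    lines.append("Output:")
    return "\n".join(lines)


# Illustrative data only.
prompt = build_few_shot_prompt(
    "Translate English to French.",
    [("sea otter", "loutre de mer"), ("cheese", "fromage")],
    "peppermint",
)
print(prompt)
```

The resulting string would be passed as-is to an autoregressive model, which continues the text after the last `Output:`. Changing the task means changing only the description and the demonstration pairs.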

Cited by: 39122 · Related articles · All 31 versions