Symbols and grounding in large language models

E Pavlick - … Transactions of the Royal Society A, 2023 - royalsocietypublishing.org
Large language models (LLMs) are one of the most impressive achievements of artificial
intelligence in recent years. However, their relevance to the study of language more broadly …

Semantic structure in deep learning

E Pavlick - Annual Review of Linguistics, 2022 - annualreviews.org
Deep learning has recently come to dominate computational linguistics, leading to claims of
human-level performance in a range of language processing tasks. Like much previous …

Human-like systematic generalization through a meta-learning neural network

BM Lake, M Baroni - Nature, 2023 - nature.com
The power of human language and thought arises from systematic compositionality—the
algebraic ability to understand and produce novel combinations from known components …
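
A minimal sketch of the systematic compositionality this entry describes, using a toy SCAN-style command language; the grammar, primitives, and action names here are illustrative assumptions, not the paper's actual benchmark:

```python
# Toy SCAN-style interpreter: commands map compositionally to action sequences.
PRIMITIVES = {"jump": ["JUMP"], "walk": ["WALK"], "run": ["RUN"]}

def interpret(command: str) -> list[str]:
    """Map a command like 'jump twice' to an action sequence, compositionally."""
    words = command.split()
    if len(words) == 1:
        return PRIMITIVES[words[0]]
    verb, modifier = words
    if modifier == "twice":
        return interpret(verb) * 2
    if modifier == "thrice":
        return interpret(verb) * 3
    raise ValueError(f"unknown modifier: {modifier}")

# Systematic generalization: once 'twice' is understood for 'walk',
# it transfers to the novel combination 'jump twice'.
assert interpret("walk twice") == ["WALK", "WALK"]
assert interpret("jump twice") == ["JUMP", "JUMP"]
```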

Prompting GPT-3 to be reliable

C Si, Z Gan, Z Yang, S Wang, J Wang… - arXiv preprint arXiv …, 2022 - arxiv.org
Large language models (LLMs) show impressive abilities via few-shot prompting.
Commercialized APIs such as OpenAI GPT-3 further increase their use in real-world …
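
A minimal sketch of the few-shot prompting this snippet refers to; the demonstrations and Q/A template are illustrative assumptions, and the resulting string could be sent to any LLM completion API:

```python
# Few-shot prompting: labeled demonstrations are prepended to the query so the
# model can infer the task in-context, without any parameter updates.
DEMONSTRATIONS = [
    ("What is the capital of France?", "Paris"),
    ("What is the capital of Japan?", "Tokyo"),
]

def build_few_shot_prompt(question: str) -> str:
    """Assemble demonstrations plus the new question into one prompt string."""
    lines = [f"Q: {q}\nA: {a}" for q, a in DEMONSTRATIONS]
    lines.append(f"Q: {question}\nA:")
    return "\n\n".join(lines)

print(build_few_shot_prompt("What is the capital of Italy?"))
```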

On the opportunities and risks of foundation models

R Bommasani, DA Hudson, E Adeli, R Altman… - arXiv preprint arXiv …, 2021 - arxiv.org
AI is undergoing a paradigm shift with the rise of models (e.g., BERT, DALL-E, GPT-3) that are
trained on broad data at scale and are adaptable to a wide range of downstream tasks. We …

Dynabench: Rethinking benchmarking in NLP

D Kiela, M Bartolo, Y Nie, D Kaushik, A Geiger… - arXiv preprint arXiv …, 2021 - arxiv.org
We introduce Dynabench, an open-source platform for dynamic dataset creation and model
benchmarking. Dynabench runs in a web browser and supports human-and-model-in-the …

Compositional semantic parsing with large language models

A Drozdov, N Schärli, E Akyürek, N Scales… - The Eleventh …, 2022 - openreview.net
Humans can reason compositionally when presented with new tasks. Previous research
shows that appropriate prompting techniques enable large language models (LLMs) to …
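
The prompting techniques referenced here follow the least-to-most pattern: decompose a problem into simpler subproblems, then solve them in order. The sketch below shows that two-stage pattern in schematic form, assuming a placeholder `llm` completion function and illustrative prompt strings rather than the paper's actual prompts:

```python
from typing import Callable

def least_to_most(question: str, llm: Callable[[str], str]) -> str:
    """Decompose a question, then answer subquestions sequentially in context."""
    # Stage 1: ask the model to break the question into simpler subquestions.
    decomposition = llm(f"Decompose into simpler subquestions:\n{question}")
    subquestions = [s for s in decomposition.splitlines() if s.strip()]

    # Stage 2: solve subquestions in order, accumulating Q/A context so that
    # later subquestions can build on earlier answers.
    context = ""
    answer = ""
    for sub in subquestions:
        answer = llm(f"{context}Q: {sub}\nA:")
        context += f"Q: {sub}\nA: {answer}\n"
    return answer
```

In practice `llm` would wrap a call to whatever completion model is available; the key design point is that each stage's output becomes part of the next stage's prompt.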

ExT5: Towards extreme multi-task scaling for transfer learning

V Aribandi, Y Tay, T Schuster, J Rao, HS Zheng… - arXiv preprint arXiv …, 2021 - arxiv.org
Despite the recent success of multi-task learning and transfer learning for natural language
processing (NLP), few works have systematically studied the effect of scaling up the number …
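
A minimal sketch of the text-to-text multi-task setup that work like this scales up: every task becomes an (input, target) text pair, prefixed with a task name and mixed into a single training stream. The task names and examples are illustrative assumptions, not the paper's actual task mixture:

```python
import random

TASKS = {
    "sentiment": [("review: great film", "positive")],
    "nli": [("premise: it rains. hypothesis: it is wet.", "entailment")],
    "summarize": [("article: long text ...", "short summary")],
}

def multitask_stream(seed: int = 0):
    """Yield (input, target) pairs sampled across tasks, prefixed by task name."""
    rng = random.Random(seed)
    while True:
        task = rng.choice(list(TASKS))
        x, y = rng.choice(TASKS[task])
        yield f"{task}: {x}", y

stream = multitask_stream()
for _ in range(3):
    print(next(stream))
```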

WILDS: A benchmark of in-the-wild distribution shifts

PW Koh, S Sagawa, H Marklund… - International …, 2021 - proceedings.mlr.press
Distribution shifts—where the training distribution differs from the test distribution—can
substantially degrade the accuracy of machine learning (ML) systems deployed in the wild …
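
A minimal sketch of the evaluation pattern this snippet describes: fit a model on one domain, then compare in-distribution accuracy against accuracy on a shifted test domain. The synthetic Gaussian data and nearest-centroid classifier are illustrative assumptions, not the benchmark itself:

```python
import numpy as np

rng = np.random.default_rng(0)

def make_domain(shift: float, n: int = 200):
    """Two Gaussian classes; `shift` moves the whole domain off-distribution."""
    X0 = rng.normal(loc=0.0 + shift, size=(n, 2))
    X1 = rng.normal(loc=2.0 + shift, size=(n, 2))
    return np.vstack([X0, X1]), np.array([0] * n + [1] * n)

X_train, y_train = make_domain(shift=0.0)
X_id, y_id = make_domain(shift=0.0)    # same distribution as training
X_ood, y_ood = make_domain(shift=1.5)  # shifted test domain

# Nearest-centroid classifier fit on the training domain only.
centroids = np.stack([X_train[y_train == c].mean(axis=0) for c in (0, 1)])

def accuracy(X, y):
    preds = np.argmin(((X[:, None, :] - centroids) ** 2).sum(-1), axis=1)
    return (preds == y).mean()

print("in-distribution accuracy:", accuracy(X_id, y_id))
print("out-of-distribution accuracy:", accuracy(X_ood, y_ood))
```

The in-distribution score stays high while the shifted domain's score degrades, which is the gap such benchmarks are designed to measure.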

A taxonomy and review of generalization research in NLP

D Hupkes, M Giulianelli, V Dankers, M Artetxe… - Nature Machine …, 2023 - nature.com
The ability to generalize well is one of the primary desiderata for models of natural language
processing (NLP), but what 'good generalization' entails and how it should be evaluated is …