Symbols and grounding in large language models

E Pavlick - … Transactions of the Royal Society A, 2023 - royalsocietypublishing.org
Large language models (LLMs) are one of the most impressive achievements of artificial
intelligence in recent years. However, their relevance to the study of language more broadly …

Semantic structure in deep learning

E Pavlick - Annual Review of Linguistics, 2022 - annualreviews.org
Deep learning has recently come to dominate computational linguistics, leading to claims of
human-level performance in a range of language processing tasks. Like much previous …

Human-like systematic generalization through a meta-learning neural network

BM Lake, M Baroni - Nature, 2023 - nature.com
The power of human language and thought arises from systematic compositionality—the
algebraic ability to understand and produce novel combinations from known components …
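
A minimal sketch of the systematic compositionality this entry describes, using a toy SCAN-style command language; the grammar, primitives, and action names here are illustrative assumptions, not the paper's actual benchmark:

```python
# Toy SCAN-style interpreter: commands map compositionally to action sequences.
PRIMITIVES = {"jump": ["JUMP"], "walk": ["WALK"], "run": ["RUN"]}

def interpret(command: str) -> list[str]:
    """Map a command like 'jump twice' to an action sequence, compositionally."""
    words = command.split()
    if len(words) == 1:
        return PRIMITIVES[words[0]]
    verb, modifier = words
    if modifier == "twice":
        return interpret(verb) * 2
    if modifier == "thrice":
        return interpret(verb) * 3
    raise ValueError(f"unknown modifier: {modifier}")

# Systematic generalization: once 'twice' is understood for 'walk',
# it transfers to the novel combination 'jump twice'.
assert interpret("walk twice") == ["WALK", "WALK"]
assert interpret("jump twice") == ["JUMP", "JUMP"]
```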

Prompting GPT-3 to be reliable

C Si, Z Gan, Z Yang, S Wang, J Wang… - arXiv preprint arXiv …, 2022 - arxiv.org
Large language models (LLMs) show impressive abilities via few-shot prompting.
Commercialized APIs such as OpenAI GPT-3 further increase their use in real-world …
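
A minimal sketch of the few-shot prompting this snippet refers to; the demonstrations and Q/A template are illustrative assumptions, and the resulting string could be sent to any LLM completion API:

```python
# Few-shot prompting: labeled demonstrations are prepended to the query so the
# model can infer the task in-context, without any parameter updates.
DEMONSTRATIONS = [
    ("What is the capital of France?", "Paris"),
    ("What is the capital of Japan?", "Tokyo"),
]

def build_few_shot_prompt(question: str) -> str:
    """Assemble demonstrations plus the new question into one prompt string."""
    lines = [f"Q: {q}\nA: {a}" for q, a in DEMONSTRATIONS]
    lines.append(f"Q: {question}\nA:")
    return "\n\n".join(lines)

print(build_few_shot_prompt("What is the capital of Italy?"))
```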

On the opportunities and risks of foundation models

R Bommasani, DA Hudson, E Adeli, R Altman… - arXiv preprint arXiv …, 2021 - arxiv.org
AI is undergoing a paradigm shift with the rise of models (e.g., BERT, DALL-E, GPT-3) that are
trained on broad data at scale and are adaptable to a wide range of downstream tasks. We …

Dynabench: Rethinking benchmarking in NLP

D Kiela, M Bartolo, Y Nie, D Kaushik, A Geiger… - arXiv preprint arXiv …, 2021 - arxiv.org
We introduce Dynabench, an open-source platform for dynamic dataset creation and model
benchmarking. Dynabench runs in a web browser and supports human-and-model-in-the …

Compositional semantic parsing with large language models

A Drozdov, N Schärli, E Akyürek, N Scales… - The Eleventh …, 2022 - openreview.net
Humans can reason compositionally when presented with new tasks. Previous research
shows that appropriate prompting techniques enable large language models (LLMs) to …
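
The prompting techniques referenced here follow the least-to-most pattern: decompose a problem into simpler subproblems, then solve them in order. The sketch below shows that two-stage pattern in schematic form, assuming a placeholder `llm` completion function and illustrative prompt strings rather than the paper's actual prompts:

```python
from typing import Callable

def least_to_most(question: str, llm: Callable[[str], str]) -> str:
    """Decompose a question, then answer subquestions sequentially in context."""
    # Stage 1: ask the model to break the question into simpler subquestions.
    decomposition = llm(f"Decompose into simpler subquestions:\n{question}")
    subquestions = [s for s in decomposition.splitlines() if s.strip()]

    # Stage 2: solve subquestions in order, accumulating Q/A context so that
    # later subquestions can build on earlier answers.
    context = ""
    answer = ""
    for sub in subquestions:
        answer = llm(f"{context}Q: {sub}\nA:")
        context += f"Q: {sub}\nA: {answer}\n"
    return answer
```

In practice `llm` would wrap a call to whatever completion model is available; the key design point is that each stage's output becomes part of the next stage's prompt.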

ExT5: Towards extreme multi-task scaling for transfer learning

V Aribandi, Y Tay, T Schuster, J Rao, HS Zheng… - arXiv preprint arXiv …, 2021 - arxiv.org
Despite the recent success of multi-task learning and transfer learning for natural language
processing (NLP), few works have systematically studied the effect of scaling up the number …
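
A minimal sketch of the text-to-text multi-task setup that work like this scales up: every task becomes an (input, target) text pair, prefixed with a task name and mixed into a single training stream. The task names and examples are illustrative assumptions, not the paper's actual task mixture:

```python
import random

TASKS = {
    "sentiment": [("review: great film", "positive")],
    "nli": [("premise: it rains. hypothesis: it is wet.", "entailment")],
    "summarize": [("article: long text ...", "short summary")],
}

def multitask_stream(seed: int = 0):
    """Yield (input, target) pairs sampled across tasks, prefixed by task name."""
    rng = random.Random(seed)
    while True:
        task = rng.choice(list(TASKS))
        x, y = rng.choice(TASKS[task])
        yield f"{task}: {x}", y

stream = multitask_stream()
for _ in range(3):
    print(next(stream))
```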

WILDS: A benchmark of in-the-wild distribution shifts

PW Koh, S Sagawa, H Marklund… - International …, 2021 - proceedings.mlr.press
Distribution shifts—where the training distribution differs from the test distribution—can
substantially degrade the accuracy of machine learning (ML) systems deployed in the wild …
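
A minimal sketch of the evaluation pattern this snippet describes: fit a model on one domain, then compare in-distribution accuracy against accuracy on a shifted test domain. The synthetic Gaussian data and nearest-centroid classifier are illustrative assumptions, not the benchmark itself:

```python
import numpy as np

rng = np.random.default_rng(0)

def make_domain(shift: float, n: int = 200):
    """Two Gaussian classes; `shift` moves the whole domain off-distribution."""
    X0 = rng.normal(loc=0.0 + shift, size=(n, 2))
    X1 = rng.normal(loc=2.0 + shift, size=(n, 2))
    return np.vstack([X0, X1]), np.array([0] * n + [1] * n)

X_train, y_train = make_domain(shift=0.0)
X_id, y_id = make_domain(shift=0.0)    # same distribution as training
X_ood, y_ood = make_domain(shift=1.5)  # shifted test domain

# Nearest-centroid classifier fit on the training domain only.
centroids = np.stack([X_train[y_train == c].mean(axis=0) for c in (0, 1)])

def accuracy(X, y):
    preds = np.argmin(((X[:, None, :] - centroids) ** 2).sum(-1), axis=1)
    return (preds == y).mean()

print("in-distribution accuracy:", accuracy(X_id, y_id))
print("out-of-distribution accuracy:", accuracy(X_ood, y_ood))
```

The in-distribution score stays high while the shifted domain's score degrades, which is the gap such benchmarks are designed to measure.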

A taxonomy and review of generalization research in NLP

D Hupkes, M Giulianelli, V Dankers, M Artetxe… - Nature Machine …, 2023 - nature.com
The ability to generalize well is one of the primary desiderata for models of natural language
processing (NLP), but what 'good generalization' entails and how it should be evaluated is …