Large language models (LMs) are able to in-context learn: they can perform a new task via inference alone by conditioning on a few input-label pairs (demonstrations) and making predictions for …
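To make this setup concrete: in-context learning amounts to serializing a handful of (input, label) demonstrations into a prompt and letting the model's next-token prediction label a new input, with no weight updates. The sketch below illustrates the pattern; the sentiment task, the examples, and the "Review:/Sentiment:" template are assumptions for illustration, not drawn from the paper.

```python
# Minimal sketch of in-context learning: condition a language model on a few
# input-label demonstrations, then ask it to label a new input. The task,
# examples, and template below are illustrative, not from the paper.

demonstrations = [
    ("the movie was a delight from start to finish", "positive"),
    ("flat characters and a predictable plot", "negative"),
    ("a warm, funny, and moving film", "positive"),
]

def build_prompt(demos, query):
    """Serialize (input, label) pairs, then append the unlabeled query."""
    blocks = [f"Review: {x}\nSentiment: {y}" for x, y in demos]
    blocks.append(f"Review: {query}\nSentiment:")
    return "\n\n".join(blocks)

prompt = build_prompt(demonstrations, "I checked my watch every five minutes")
print(prompt)
# The model's next-token prediction after the final "Sentiment:" serves as
# its label for the new input: inference alone, no gradient updates.
```

Scoring the model's continuation (for instance, comparing the probabilities it assigns to the candidate label words) then yields the prediction.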
Large language models have achieved impressive performance on various natural language processing tasks. However, so far they have been evaluated primarily on …
Existing techniques for training language models can be misaligned with the truth: if we train models with imitation learning, they may reproduce errors that humans make; if we train …
The past decade has witnessed dramatic gains in natural language processing and an unprecedented scaling of large language models. These developments have been …
Large language models (LLMs) have demonstrated remarkable potential in handling multilingual machine translation (MMT). In this paper, we systematically investigate the …
Large language models (LLMs) such as GPT-3 and GPT-4 are powerful, but their weights are often publicly unavailable, and their immense size makes them difficult to tune with …
We report the presence of a simple neural mechanism that represents an input-output function as a vector within autoregressive transformer language models (LMs). Using causal …
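The snippet breaks off at the causal-analysis step, but the core claim (that a task can be summarized as a single vector and injected at inference time) can be sketched in a simplified form. The following is a rough illustration, not the paper's causal-mediation procedure: it averages a mid-layer hidden state at the final token over a few in-context prompts for one task, then adds the result back during a zero-shot forward pass via a hook. GPT-2, the layer index, and the prompts are all assumptions.

```python
# Rough sketch of the function-vector idea: average a mid-layer hidden state
# at the last token over several ICL prompts, then add that vector into a
# zero-shot forward pass. A simplification of the paper's method; GPT-2,
# LAYER = 9, and the antonym prompts are assumptions for illustration.
import torch
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tok = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2").eval()
LAYER = 9  # which residual-stream layer to read and patch (assumption)

icl_prompts = [
    "hot -> cold\nbig -> small\ntall ->",
    "fast -> slow\nup -> down\nwide ->",
]

# 1) Collect the hidden state at the last token of each ICL prompt.
#    hidden_states[LAYER] is the output of transformer block LAYER - 1
#    (index 0 holds the embedding output).
states = []
for p in icl_prompts:
    ids = tok(p, return_tensors="pt").input_ids
    with torch.no_grad():
        hs = model(ids, output_hidden_states=True).hidden_states[LAYER]
    states.append(hs[0, -1])
function_vector = torch.stack(states).mean(0)

# 2) Add the vector at the same layer during a zero-shot forward pass.
def add_fv(module, inputs, output):
    hidden = output[0]
    hidden[:, -1, :] += function_vector
    return (hidden,) + output[1:]

handle = model.transformer.h[LAYER - 1].register_forward_hook(add_fv)
ids = tok("good ->", return_tensors="pt").input_ids
with torch.no_grad():
    logits = model(ids).logits
handle.remove()
print(tok.decode(logits[0, -1].argmax()))  # hopefully an antonym of "good"
```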
In-context learning (ICL) emerges as a promising capability of large language models (LLMs) by providing them with demonstration examples to perform diverse tasks. However …
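The snippet is cut off at "However", so the paper's specific concern is lost; one issue ICL work commonly raises is sensitivity to which demonstrations are provided. As an illustrative baseline (an assumption, not this paper's method), the sketch below picks demonstrations for a query by TF-IDF cosine similarity over a candidate pool.

```python
# Illustrative demonstration-selection baseline (an assumption, not this
# paper's method): rank a candidate pool by TF-IDF cosine similarity to the
# query and keep the top-k pairs as in-context demonstrations.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

pool = [
    ("translate 'chat' to English", "cat"),
    ("translate 'chien' to English", "dog"),
    ("what is 2 + 2", "4"),
    ("translate 'maison' to English", "house"),
]

def select_demos(query, pool, k=2):
    texts = [q for q, _ in pool] + [query]
    tfidf = TfidfVectorizer().fit_transform(texts)
    sims = cosine_similarity(tfidf[-1], tfidf[:-1])[0]
    top = sims.argsort()[::-1][:k]
    return [pool[i] for i in top]

print(select_demos("translate 'pomme' to English", pool))
# -> the translation pairs, which match the query most closely
```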
The emergent few-shot reasoning capabilities of Large Language Models (LLMs) have excited the natural language processing and machine learning communities in recent years. Despite …