A comprehensive survey on pretrained foundation models: A history from BERT to ChatGPT

C Zhou, Q Li, C Li, J Yu, Y Liu, G Wang… - arXiv preprint arXiv …, 2023 - arxiv.org
Pretrained Foundation Models (PFMs) are regarded as the foundation for various
downstream tasks with different data modalities. A PFM (e.g., BERT, ChatGPT, and GPT-4) is …

A comprehensive survey of few-shot learning: Evolution, applications, challenges, and opportunities

Y Song, T Wang, P Cai, SK Mondal… - ACM Computing Surveys, 2023 - dl.acm.org
Few-shot learning (FSL) has emerged as an effective learning method and shows great
potential. Despite recent creative work on tackling FSL tasks, learning valid information …

Toolformer: Language models can teach themselves to use tools

T Schick, J Dwivedi-Yu, R Dessì… - Advances in …, 2024 - proceedings.neurips.cc
Language models (LMs) exhibit remarkable abilities to solve new tasks from just a
few examples or textual instructions, especially at scale. They also, paradoxically, struggle …
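To make the idea concrete, here is a minimal, self-contained sketch of the inline tool-call pattern the paper describes: the model's generation contains a marked-up API call that a post-processor executes and splices back into the text. The `[Calculator(...)]` syntax follows the paper's examples, while the parser and the arithmetic tool below are illustrative stand-ins, not the authors' implementation.

```python
import re

# Toolformer-style inline tool use: find a marked-up call in the model's
# output, execute it, and splice the result back in after the call.
TOOL_CALL = re.compile(r"\[Calculator\(([^)]*)\)\]")

def run_calculator(expr: str) -> str:
    # Evaluate simple arithmetic only; a real deployment would sandbox this.
    allowed = set("0123456789+-*/(). ")
    if not set(expr) <= allowed:
        raise ValueError(f"unsupported expression: {expr!r}")
    return str(round(eval(expr), 2))

def execute_tool_calls(generation: str) -> str:
    # Rewrite each call as "[Calculator(expr) -> result]" as in the paper.
    def splice(match: re.Match) -> str:
        expr = match.group(1)
        return f"[Calculator({expr}) -> {run_calculator(expr)}]"
    return TOOL_CALL.sub(splice, generation)

print(execute_tool_calls("Out of 1400 participants, 400 [Calculator(400/1400)] passed."))
# Out of 1400 participants, 400 [Calculator(400/1400) -> 0.29] passed.
```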

Large language models are human-level prompt engineers

Y Zhou, AI Muresanu, Z Han, K Paster, S Pitis… - arXiv preprint arXiv …, 2022 - arxiv.org
By conditioning on natural language instructions, large language models (LLMs) have
displayed impressive capabilities as general-purpose computers. However, task …
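The core loop behind using an LLM as its own prompt engineer can be sketched as propose-then-score: the model drafts candidate instructions from demonstrations, and each candidate is ranked by how well it solves held-out examples. The sketch below assumes a generic `generate` callable standing in for any completion model and uses a toy accuracy scorer; it illustrates the search idea, not the paper's exact algorithm.

```python
# Propose candidate instructions from demonstrations, score each candidate
# on held-out pairs, keep the best one.

def score(generate, instruction, heldout):
    # Fraction of held-out (input, output) pairs answered correctly.
    hits = sum(
        generate(f"{instruction}\nInput: {x}\nOutput:").strip() == y
        for x, y in heldout
    )
    return hits / len(heldout)

def search_instruction(generate, demos, heldout, n_candidates=8):
    demo_text = "\n".join(f"Input: {x}\nOutput: {y}" for x, y in demos)
    meta_prompt = (
        "I gave a friend an instruction. Based on the instruction they "
        f"produced these input-output pairs:\n{demo_text}\nThe instruction was:"
    )
    # Sampling the same meta-prompt repeatedly yields diverse candidates.
    candidates = {generate(meta_prompt).strip() for _ in range(n_candidates)}
    return max(candidates, key=lambda c: score(generate, c, heldout))
```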

Parameter-efficient fine-tuning of large-scale pre-trained language models

N Ding, Y Qin, G Yang, F Wei, Z Yang, Y Su… - Nature Machine …, 2023 - nature.com
With the prevalence of pre-trained language models (PLMs) and the pre-training–fine-tuning
paradigm, it has been continuously shown that larger models tend to yield better …
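One representative method in this family is low-rank adaptation (LoRA): the pretrained weight is frozen and only a rank-r update B @ A is trained, shrinking the trainable parameter count from d_out*d_in to r*(d_in + d_out). A minimal PyTorch sketch, with illustrative rank and scaling hyperparameters:

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    def __init__(self, base: nn.Linear, r: int = 8, alpha: float = 16.0):
        super().__init__()
        self.base = base
        self.base.weight.requires_grad_(False)  # freeze pretrained weights
        if self.base.bias is not None:
            self.base.bias.requires_grad_(False)
        self.A = nn.Parameter(torch.randn(r, base.in_features) * 0.01)
        self.B = nn.Parameter(torch.zeros(base.out_features, r))  # update starts at zero
        self.scale = alpha / r

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Frozen base output plus the scaled low-rank update.
        return self.base(x) + self.scale * (x @ self.A.T @ self.B.T)

layer = LoRALinear(nn.Linear(768, 768))
trainable = sum(p.numel() for p in layer.parameters() if p.requires_grad)
print(f"trainable params: {trainable}")  # 12288, versus 590592 for the full layer
```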

Crosslingual generalization through multitask finetuning

N Muennighoff, T Wang, L Sutawika, A Roberts… - arXiv preprint arXiv …, 2022 - arxiv.org
Multitask prompted finetuning (MTF) has been shown to help large language models
generalize to new tasks in a zero-shot setting, but so far explorations of MTF have focused …
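Mechanically, multitask prompted finetuning amounts to rendering many tasks' examples through natural-language templates into (prompt, target) pairs and pooling them into one shuffled training mixture. A minimal sketch with placeholder tasks and templates:

```python
import random

# Render each task's examples through a prompt template, then pool and
# shuffle everything so one model is finetuned on all tasks at once.
TEMPLATES = {
    "nli": "Premise: {premise}\nHypothesis: {hypothesis}\nDoes the premise entail the hypothesis?",
    "summarization": "Summarize the following article:\n{article}",
}

def render(task: str, example: dict) -> tuple[str, str]:
    return TEMPLATES[task].format(**example["inputs"]), example["target"]

def build_mixture(datasets: dict[str, list[dict]], seed: int = 0):
    pool = [render(task, ex) for task, examples in datasets.items() for ex in examples]
    random.Random(seed).shuffle(pool)
    return pool  # (prompt, target) pairs for a single finetuning run
```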

Challenging BIG-Bench tasks and whether chain-of-thought can solve them

M Suzgun, N Scales, N Schärli, S Gehrmann… - arXiv preprint arXiv …, 2022 - arxiv.org
BIG-Bench (Srivastava et al., 2022) is a diverse evaluation suite that focuses on tasks
believed to be beyond the capabilities of current language models. Language models have …
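Chain-of-thought prompting, the intervention evaluated on these tasks, places worked reasoning inside the few-shot exemplars so the model produces intermediate steps before its answer. A minimal sketch, with an illustrative exemplar and a generic `generate` callable standing in for the model:

```python
# One few-shot exemplar with worked reasoning; answer-only ("direct")
# prompting would include just the question and the final answer.
COT_EXEMPLAR = (
    "Q: A juggler has 16 balls. Half are golf balls, and half of the golf "
    "balls are blue. How many blue golf balls are there?\n"
    "A: Half of 16 is 8 golf balls. Half of 8 is 4 blue golf balls. "
    "The answer is 4.\n\n"
)

def cot_prompt(question: str) -> str:
    return f"{COT_EXEMPLAR}Q: {question}\nA: Let's think step by step."

def answer(generate, question: str) -> str:
    return generate(cot_prompt(question))
```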

Few-shot parameter-efficient fine-tuning is better and cheaper than in-context learning

H Liu, D Tam, M Muqeeth, J Mohta… - Advances in …, 2022 - proceedings.neurips.cc
Few-shot in-context learning (ICL) enables pre-trained language models to perform a
previously-unseen task without any gradient-based training by feeding a small number of …
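For contrast with finetuning, few-shot ICL amounts to concatenating the labeled examples into the prompt and updating no weights at all. A minimal sketch of the prompt construction (the input/output format is illustrative):

```python
# Build a few-shot prompt: demonstrations are concatenated before the query,
# and the model is expected to continue the pattern with no gradient updates.
def icl_prompt(demos: list[tuple[str, str]], query: str) -> str:
    shots = "\n\n".join(f"Input: {x}\nOutput: {y}" for x, y in demos)
    return f"{shots}\n\nInput: {query}\nOutput:"

demos = [("great movie!", "positive"), ("a dull, lifeless film", "negative")]
print(icl_prompt(demos, "an instant classic"))
```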

Large language models can self-improve

J Huang, SS Gu, L Hou, Y Wu, X Wang, H Yu… - arXiv preprint arXiv …, 2022 - arxiv.org
Large Language Models (LLMs) have achieved excellent performance on various tasks.
However, fine-tuning an LLM requires extensive supervision. Humans, on the other hand …
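The self-improvement recipe can be sketched as: sample several reasoning paths per unlabeled question, keep the majority-vote answer, and reuse the agreeing rationales as new finetuning data. The sketch below assumes a stochastic `sample` callable returning (rationale, answer) pairs; the agreement threshold is illustrative, not the paper's setting.

```python
from collections import Counter

def self_generate_data(sample, questions, n_paths=8, min_agreement=0.6):
    # Build finetuning triples from unlabeled questions via majority voting.
    data = []
    for q in questions:
        paths = [sample(q) for _ in range(n_paths)]  # diverse reasoning paths
        votes = Counter(ans for _, ans in paths)
        best, count = votes.most_common(1)[0]
        if count / n_paths >= min_agreement:  # keep only high-confidence answers
            data += [(q, rationale, best) for rationale, ans in paths if ans == best]
    return data  # (question, rationale, answer) triples for finetuning
```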

Gorilla: Large language model connected with massive APIs

SG Patil, T Zhang, X Wang, JE Gonzalez - arXiv preprint arXiv:2305.15334, 2023 - arxiv.org
Large Language Models (LLMs) have seen an impressive wave of advances recently, with
models now excelling in a variety of tasks, such as mathematical reasoning and program …
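In spirit, connecting a model to a large API collection means retrieving the relevant API documentation for a request and conditioning generation on it, so the emitted call matches a real signature. The toy catalog, the overlap-based retriever, and the `generate` callable below are illustrative placeholders, not Gorilla's actual pipeline:

```python
# Toy catalog of documented APIs; a real system would hold thousands.
API_CATALOG = {
    "translate(text, target_lang)": "Translate text into a target language.",
    "classify_image(path)": "Classify the object in an image file.",
}

def retrieve(request: str) -> str:
    # Toy relevance score: word overlap between the request and each doc.
    def overlap(doc: str) -> int:
        return len(set(request.lower().split()) & set(doc.lower().split()))
    return max(API_CATALOG, key=lambda sig: overlap(API_CATALOG[sig]))

def emit_call(generate, request: str) -> str:
    # Condition generation on the retrieved signature and documentation.
    sig = retrieve(request)
    prompt = (f"API: {sig}\nDocs: {API_CATALOG[sig]}\n"
              f"Request: {request}\nCall:")
    return generate(prompt)
```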