Challenges and applications of large language models

J Kaddour, J Harris, M Mozes, H Bradley… - arXiv preprint arXiv …, 2023 - arxiv.org
Large Language Models (LLMs) went from non-existent to ubiquitous in the machine
learning discourse within a few years. Due to the fast pace of the field, it is difficult to identify …

Brain-inspired learning in artificial neural networks: a review

S Schmidgall, R Ziaei, J Achterberg, L Kirsch… - APL Machine …, 2024 - pubs.aip.org
Artificial neural networks (ANNs) have emerged as an essential tool in machine learning,
achieving remarkable success across diverse domains, including image and speech …

A survey of large language models

WX Zhao, K Zhou, J Li, T Tang, X Wang, Y Hou… - arXiv preprint arXiv …, 2023 - arxiv.org
Language is essentially a complex, intricate system of human expressions governed by
grammatical rules. It poses a significant challenge to develop capable AI algorithms for …

A survey on in-context learning

Q Dong, L Li, D Dai, C Zheng, J Ma, R Li, H Xia… - arXiv preprint arXiv …, 2022 - arxiv.org
With the increasing capabilities of large language models (LLMs), in-context learning (ICL)
has emerged as a new paradigm for natural language processing (NLP), where LLMs make …

Language models can solve computer tasks

G Kim, P Baldi, S McAleer - Advances in Neural Information …, 2023 - proceedings.neurips.cc
Agents capable of carrying out general tasks on a computer can improve efficiency and
productivity by automating repetitive tasks and assisting in complex problem-solving. Ideally …

Larger language models do in-context learning differently

J Wei, J Wei, Y Tay, D Tran, A Webson, Y Lu… - arXiv preprint arXiv …, 2023 - arxiv.org
We study how in-context learning (ICL) in language models is affected by semantic priors
versus input-label mappings. We investigate two setups: ICL with flipped labels and ICL with …

Transformers learn to implement preconditioned gradient descent for in-context learning

K Ahn, X Cheng, H Daneshmand… - Advances in Neural …, 2023 - proceedings.neurips.cc
Several recent works demonstrate that transformers can implement algorithms like gradient
descent. By a careful construction of weights, these works show that multiple layers of …

Transformers as statisticians: Provable in-context learning with in-context algorithm selection

Y Bai, F Chen, H Wang, C Xiong… - Advances in Neural …, 2024 - proceedings.neurips.cc
Neural sequence models based on the transformer architecture have demonstrated
remarkable in-context learning (ICL) abilities, where they can perform new tasks …

Jailbreak and guard aligned language models with only few in-context demonstrations

Z Wei, Y Wang, A Li, Y Mo, Y Wang - arXiv preprint arXiv:2310.06387, 2023 - arxiv.org
Large Language Models (LLMs) have shown remarkable success in various tasks, yet their
safety and the risk of generating harmful content remain pressing concerns. In this paper, we …

Why can GPT learn in-context? Language models implicitly perform gradient descent as meta-optimizers

D Dai, Y Sun, L Dong, Y Hao, S Ma, Z Sui… - arXiv preprint arXiv …, 2022 - arxiv.org
Large pretrained language models have shown surprising in-context learning (ICL) ability.
With a few demonstration input-label pairs, they can predict the label for an unseen input …