Advances and challenges in meta-learning: A technical review

A Vettoruzzo, MR Bouguelia… - … on Pattern Analysis …, 2024 - ieeexplore.ieee.org
Meta-learning empowers learning systems with the ability to acquire knowledge from
multiple tasks, enabling faster adaptation and generalization to new tasks. This review …
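
As an illustrative aside (not taken from the review itself), a first-order MAML-style inner/outer loop sketches what "acquiring knowledge from multiple tasks" for "faster adaptation" can look like in code. The task distribution, learning rates, and toy 1-D regression setup below are all hypothetical choices for the sketch.

```python
import numpy as np

rng = np.random.default_rng(0)

# Tasks: 1-D linear regressions y = a * x, with the slope a varying per task (hypothetical setup).
def sample_task():
    a = rng.uniform(0.5, 2.5)
    x = rng.normal(size=20)
    return x, a * x

def mse_and_grad(theta, x, y):
    err = theta * x - y
    return np.mean(err ** 2), np.mean(2.0 * err * x)

theta = 0.0                                   # meta-learned initialization
inner_lr, outer_lr, tasks_per_batch = 0.1, 0.01, 8

for _ in range(2000):
    meta_grad = 0.0
    for _ in range(tasks_per_batch):
        x, y = sample_task()
        _, g_support = mse_and_grad(theta, x[:10], y[:10])
        theta_adapted = theta - inner_lr * g_support     # inner loop: one adaptation step per task
        _, g_query = mse_and_grad(theta_adapted, x[10:], y[10:])
        meta_grad += g_query / tasks_per_batch           # first-order approximation (ignores the Jacobian of theta_adapted w.r.t. theta)
    theta -= outer_lr * meta_grad                        # outer loop: update the shared initialization

print(f"meta-learned initialization: {theta:.3f}")
```

After meta-training, a single inner-loop step from theta is meant to fit any task drawn from the same distribution, which is the "faster adaptation to new tasks" the snippet refers to.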

Brain-inspired learning in artificial neural networks: a review

S Schmidgall, R Ziaei, J Achterberg, L Kirsch… - APL Machine …, 2024 - pubs.aip.org
Artificial neural networks (ANNs) have emerged as an essential tool in machine learning,
achieving remarkable success across diverse domains, including image and speech …

Transformers learn in-context by gradient descent

J Von Oswald, E Niklasson… - International …, 2023 - proceedings.mlr.press
At present, the mechanisms of in-context learning in Transformers are not well understood
and remain mostly an intuition. In this paper, we suggest that training Transformers on auto …
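
The paper's suggestion is that trained self-attention can implement gradient-descent steps on the in-context examples. A hedged toy sketch of that equivalence (our construction, not the authors' exact one): for in-context linear regression starting from a zero weight vector, the prediction after one explicit gradient step on the context equals an unnormalized dot-product attention readout over the context pairs.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy setup: in-context linear regression.
d, n = 4, 16
w_star = rng.normal(size=d)
X = rng.normal(size=(n, d))    # context inputs x_1..x_n
y = X @ w_star                 # context targets y_1..y_n
x_q = rng.normal(size=d)       # query input
eta = 0.1                      # step size of the implicit gradient-descent step

# (a) One explicit gradient step on L(w) = 1/(2n) * sum_i (w.x_i - y_i)^2, from w0 = 0.
w0 = np.zeros(d)
grad = X.T @ (X @ w0 - y) / n
w1 = w0 - eta * grad
pred_gd = w1 @ x_q

# (b) The same prediction as unnormalized linear attention over the context tokens:
#     the query attends to keys x_i via dot products and aggregates the values y_i.
pred_attn = eta / n * (X @ x_q) @ y

print(np.isclose(pred_gd, pred_attn))   # True: the two computations coincide
```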

Why can GPT learn in-context? Language models implicitly perform gradient descent as meta-optimizers

D Dai, Y Sun, L Dong, Y Hao, S Ma, Z Sui… - arXiv preprint arXiv …, 2022 - arxiv.org
Large pretrained language models have shown surprising in-context learning (ICL) ability.
With a few demonstration input-label pairs, they can predict the label for an unseen input …

Transformers as statisticians: Provable in-context learning with in-context algorithm selection

Y Bai, F Chen, H Wang, C Xiong… - Advances in neural …, 2024 - proceedings.neurips.cc
Neural sequence models based on the transformer architecture have demonstrated
remarkable in-context learning (ICL) abilities, where they can perform new tasks …

Transformers as algorithms: Generalization and stability in in-context learning

Y Li, ME Ildiz, D Papailiopoulos… - … on Machine Learning, 2023 - proceedings.mlr.press
In-context learning (ICL) is a type of prompting where a transformer model operates on a
sequence of (input, output) examples and performs inference on-the-fly. In this work, we …
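
To make the "(input, output) sequence" view of ICL concrete, here is a small hypothetical helper that packs demonstration pairs plus a query into a single prompt for an autoregressive model; the separator and formatting are our assumptions, not the paper's protocol.

```python
def build_icl_prompt(demos, query, sep=" -> "):
    """Serialize (input, output) demonstrations and a final query into one ICL prompt."""
    lines = [f"{x}{sep}{y}" for x, y in demos]
    lines.append(f"{query}{sep}")          # the model completes the missing output on-the-fly
    return "\n".join(lines)

demos = [("2 + 3", "5"), ("7 + 1", "8"), ("4 + 4", "8")]
print(build_icl_prompt(demos, "6 + 2"))
# 2 + 3 -> 5
# 7 + 1 -> 8
# 4 + 4 -> 8
# 6 + 2 ->
```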

Large language models as general pattern machines

S Mirchandani, F Xia, P Florence, B Ichter… - arXiv preprint arXiv …, 2023 - arxiv.org
We observe that pre-trained large language models (LLMs) are capable of autoregressively
completing complex token sequences--from arbitrary ones procedurally generated by …

Supervised pretraining can learn in-context reinforcement learning

J Lee, A Xie, A Pacchiano, Y Chandak… - Advances in …, 2024 - proceedings.neurips.cc
Large transformer models trained on diverse datasets have shown a remarkable ability to
learn in-context, achieving high few-shot performance on tasks they were not explicitly …

Structured state space models for in-context reinforcement learning

C Lu, Y Schroecker, A Gu, E Parisotto… - Advances in …, 2024 - proceedings.neurips.cc
Structured state space sequence (S4) models have recently achieved state-of-the-art
performance on long-range sequence modeling tasks. These models also have fast …

Long-range transformers for dynamic spatiotemporal forecasting

J Grigsby, Z Wang, N Nguyen, Y Qi - arXiv preprint arXiv:2109.12218, 2021 - arxiv.org
Multivariate time series forecasting focuses on predicting future values based on historical
context. State-of-the-art sequence-to-sequence models rely on neural attention between …