Large language models as general pattern machines

S Mirchandani, F Xia, P Florence, B Ichter… - arXiv preprint arXiv …, 2023 - arxiv.org
We observe that pre-trained large language models (LLMs) are capable of autoregressively
completing complex token sequences – from arbitrary ones procedurally generated by …

Transformers as decision makers: Provable in-context reinforcement learning via supervised pretraining

L Lin, Y Bai, S Mei - arXiv preprint arXiv:2310.08566, 2023 - arxiv.org
Large transformer models pretrained on offline reinforcement learning datasets have
demonstrated remarkable in-context reinforcement learning (ICRL) capabilities, where they …

Reason for future, act for now: A principled framework for autonomous llm agents with provable sample efficiency

Z Liu, H Hu, S Zhang, H Guo, S Ke, B Liu… - arXiv preprint arXiv …, 2023 - arxiv.org
Large language models (LLMs) demonstrate impressive reasoning abilities, but translating
reasoning into actions in the real world remains challenging. In particular, it remains unclear …

Rethinking decision transformer via hierarchical reinforcement learning

Y Ma, J Hao, H Liang, C Xiao - Forty-first International …, 2023 - openreview.net
Decision Transformer (DT) is an innovative algorithm leveraging recent advances in the
transformer architecture for reinforcement learning (RL). However, a notable limitation of DT …

Reason for future, act for now: A principled architecture for autonomous llm agents

Z Liu, H Hu, S Zhang, H Guo, S Ke, B Liu… - Forty-first International …, 2023 - openreview.net
Large language models (LLMs) demonstrate impressive reasoning abilities, but translating
reasoning into actions in the real world remains challenging. In particular, it is unclear how …

Can large language models explore in-context?

A Krishnamurthy, K Harris, DJ Foster, C Zhang… - arXiv preprint arXiv …, 2024 - arxiv.org
We investigate the extent to which contemporary Large Language Models (LLMs) can
engage in exploration, a core capability in reinforcement learning and decision making. We …

Do llm agents have regret? a case study in online learning and games

C Park, X Liu, A Ozdaglar, K Zhang - arXiv preprint arXiv:2403.16843, 2024 - arxiv.org
Large language models (LLMs) have been increasingly employed for (interactive) decision-
making, via the development of LLM-based autonomous agents. Despite their emerging …

In-context learning of a linear Transformer block: benefits of the MLP component and one-step GD initialization

R Zhang, J Wu, PL Bartlett - arXiv preprint arXiv:2402.14951, 2024 - arxiv.org
We study the in-context learning (ICL) ability of a Linear Transformer
Block (LTB) that combines a linear attention component and a linear multi-layer perceptron …

Towards Generalist Robot Learning from Internet Video: A Survey

R McCarthy, DCH Tan, D Schmidt, F Acero… - arXiv preprint arXiv …, 2024 - arxiv.org
This survey presents an overview of methods for learning from video (LfV) in the context of
reinforcement learning (RL) and robotics. We focus on methods capable of scaling to large …

XLand-minigrid: Scalable meta-reinforcement learning environments in JAX

A Nikulin, V Kurenkov, I Zisman, A Agarkov… - arXiv preprint arXiv …, 2023 - arxiv.org
We present XLand-MiniGrid, a suite of tools and grid-world environments for meta-
reinforcement learning research inspired by the diversity and depth of XLand and the …