Large language models as general pattern machines

S Mirchandani, F Xia, P Florence, B Ichter… - arXiv preprint arXiv …, 2023 - arxiv.org
We observe that pre-trained large language models (LLMs) are capable of autoregressively
completing complex token sequences – from arbitrary ones procedurally generated by …

Transformers as decision makers: Provable in-context reinforcement learning via supervised pretraining

L Lin, Y Bai, S Mei - arXiv preprint arXiv:2310.08566, 2023 - arxiv.org
Large transformer models pretrained on offline reinforcement learning datasets have
demonstrated remarkable in-context reinforcement learning (ICRL) capabilities, where they …

Reason for future, act for now: A principled framework for autonomous llm agents with provable sample efficiency

Z Liu, H Hu, S Zhang, H Guo, S Ke, B Liu… - arXiv preprint arXiv …, 2023 - arxiv.org
Large language models (LLMs) demonstrate impressive reasoning abilities, but translating
reasoning into actions in the real world remains challenging. In particular, it remains unclear …

Rethinking decision transformer via hierarchical reinforcement learning

Y Ma, J Hao, H Liang, C Xiao - Forty-first International …, 2023 - openreview.net
Decision Transformer (DT) is an innovative algorithm leveraging recent advances in the
transformer architecture for reinforcement learning (RL). However, a notable limitation of DT …

Reason for future, act for now: A principled architecture for autonomous llm agents

Z Liu, H Hu, S Zhang, H Guo, S Ke, B Liu… - Forty-first International …, 2023 - openreview.net
Large language models (LLMs) demonstrate impressive reasoning abilities, but translating
reasoning into actions in the real world remains challenging. In particular, it is unclear how …

Can large language models explore in-context?

A Krishnamurthy, K Harris, DJ Foster, C Zhang… - arXiv preprint arXiv …, 2024 - arxiv.org
We investigate the extent to which contemporary Large Language Models (LLMs) can
engage in exploration, a core capability in reinforcement learning and decision making. We …

Do llm agents have regret? a case study in online learning and games

C Park, X Liu, A Ozdaglar, K Zhang - arXiv preprint arXiv:2403.16843, 2024 - arxiv.org
Large language models (LLMs) have been increasingly employed for (interactive) decision-
making, via the development of LLM-based autonomous agents. Despite their emerging …

In-context learning of a linear Transformer block: benefits of the MLP component and one-step GD initialization

R Zhang, J Wu, PL Bartlett - arXiv preprint arXiv:2402.14951, 2024 - arxiv.org
We study the in-context learning (ICL) ability of a Linear Transformer
Block (LTB) that combines a linear attention component and a linear multi-layer perceptron …

Towards Generalist Robot Learning from Internet Video: A Survey

R McCarthy, DCH Tan, D Schmidt, F Acero… - arXiv preprint arXiv …, 2024 - arxiv.org
This survey presents an overview of methods for learning from video (LfV) in the context of
reinforcement learning (RL) and robotics. We focus on methods capable of scaling to large …

XLand-minigrid: Scalable meta-reinforcement learning environments in JAX

A Nikulin, V Kurenkov, I Zisman, A Agarkov… - arXiv preprint arXiv …, 2023 - arxiv.org
We present XLand-MiniGrid, a suite of tools and grid-world environments for meta-
reinforcement learning research inspired by the diversity and depth of XLand and the …