RL Unplugged: A suite of benchmarks for offline reinforcement learning

RF Prudencio, MROA Maximo… - IEEE Transactions on …, 2023 - ieeexplore.ieee.org

With the widespread adoption of deep learning, reinforcement learning (RL) has
experienced a dramatic increase in popularity, scaling to previously intractable problems …

Cited by 230 · Related articles · All 9 versions

[PDF] arxiv.org

A generalist agent

S Reed, K Zolna, E Parisotto, SG Colmenarejo… - arXiv preprint arXiv …, 2022 - arxiv.org

Inspired by progress in large-scale language modeling, we apply a similar approach
towards building a single generalist agent beyond the realm of text outputs. The agent …

Cited by 776 · Related articles · All 4 versions

[PDF] thecvf.com

Dataset distillation by matching training trajectories

G Cazenavette, T Wang, A Torralba… - Proceedings of the …, 2022 - openaccess.thecvf.com

Dataset distillation is the task of synthesizing a small dataset such that a model trained on
the synthetic set will match the test accuracy of the model trained on the full dataset. In this …

Cited by 275 · Related articles · All 9 versions

[PDF] neurips.cc

Multi-game decision transformers

KH Lee, O Nachum, MS Yang, L Lee… - Advances in …, 2022 - proceedings.neurips.cc

A longstanding goal of the field of AI is a method for learning a highly capable, generalist
agent from diverse experience. In the subfields of vision and language, this was largely …

Cited by 185 · Related articles · All 10 versions

[PDF] jair.org

A survey of zero-shot generalisation in deep reinforcement learning

R Kirk, A Zhang, E Grefenstette, T Rocktäschel - Journal of Artificial …, 2023 - jair.org

The study of zero-shot generalisation (ZSG) in deep Reinforcement Learning (RL) aims to
produce RL algorithms whose policies generalise well to novel unseen situations at …

Cited by 321 · Related articles · All 9 versions

[PDF] mlr.press

Implicit behavioral cloning

P Florence, C Lynch, A Zeng… - … on Robot Learning, 2022 - proceedings.mlr.press

We find that across a wide range of robot policy learning scenarios, treating supervised
policy learning with an implicit model generally performs better, on average, than commonly …

Cited by 282 · Related articles · All 9 versions

[PDF] mlr.press

Is pessimism provably efficient for offline RL?

Y Jin, Z Yang, Z Wang - International Conference on …, 2021 - proceedings.mlr.press

We study offline reinforcement learning (RL), which aims to learn an optimal policy based on
a dataset collected a priori. Due to the lack of further interactions with the environment …

Cited by 392 · Related articles · All 7 versions

[PDF] arxiv.org

Foundation models for decision making: Problems, methods, and opportunities

S Yang, O Nachum, Y Du, J Wei, P Abbeel… - arXiv preprint arXiv …, 2023 - arxiv.org

Foundation models pretrained on diverse data at scale have demonstrated extraordinary
capabilities in a wide range of vision and language tasks. When such models are deployed …

Cited by 95 · Related articles · All 3 versions

[PDF] openreview.net

What matters in learning from offline human demonstrations for robot manipulation

A Mandlekar, D Xu, J Wong, S Nasiriany… - arXiv preprint arXiv …, 2021 - arxiv.org

Imitating human demonstrations is a promising approach to endow robots with various
manipulation capabilities. While recent advances have been made in imitation learning and …

Cited by 292 · Related articles · All 4 versions

[PDF] mlr.press

Accelerating reinforcement learning with learned skill priors

K Pertsch, Y Lee, J Lim - Conference on robot learning, 2021 - proceedings.mlr.press

Intelligent agents rely heavily on prior experience when learning a new task, yet most
modern reinforcement learning (RL) approaches learn every task from scratch. One …

Cited by 224 · Related articles · All 4 versions
