The arcade learning environment: An evaluation platform for general agents

RF Prudencio, MROA Maximo… - IEEE Transactions on …, 2023 - ieeexplore.ieee.org

With the widespread adoption of deep learning, reinforcement learning (RL) has
experienced a dramatic increase in popularity, scaling to previously intractable problems …

Cited by 211 · Related articles · All 7 versions

[PDF] arxiv.org

Exploration in deep reinforcement learning: A survey

P Ladosz, L Weng, M Kim, H Oh - Information Fusion, 2022 - Elsevier

This paper reviews exploration techniques in deep reinforcement learning. Exploration
techniques are of primary importance when solving sparse reward problems. In sparse …

Cited by 183 · Related articles · All 4 versions

[PDF] arxiv.org

A generalist agent

S Reed, K Zolna, E Parisotto, SG Colmenarejo… - arXiv preprint arXiv …, 2022 - arxiv.org

Inspired by progress in large-scale language modeling, we apply a similar approach
towards building a single generalist agent beyond the realm of text outputs. The agent …

Cited by 734 · Related articles · All 4 versions

[PDF] arxiv.org

Mastering diverse domains through world models

D Hafner, J Pasukonis, J Ba, T Lillicrap - arXiv preprint arXiv:2301.04104, 2023 - arxiv.org

General intelligence requires solving tasks across many domains. Current reinforcement
learning algorithms carry this potential but are held back by the resources and knowledge …

Cited by 291 · Related articles · All 3 versions

[PDF] neurips.cc

Multi-game decision transformers

KH Lee, O Nachum, MS Yang, L Lee… - Advances in …, 2022 - proceedings.neurips.cc

A longstanding goal of the field of AI is a method for learning a highly capable, generalist
agent from diverse experience. In the subfields of vision and language, this was largely …

Cited by 176 · Related articles · All 8 versions

[PDF] neurips.cc

Deep reinforcement learning at the edge of the statistical precipice

R Agarwal, M Schwarzer, PS Castro… - Advances in neural …, 2021 - proceedings.neurips.cc

Deep reinforcement learning (RL) algorithms are predominantly evaluated by comparing
their relative performance on a large suite of tasks. Most published results on deep RL …

Cited by 542 · Related articles · All 6 versions

[PDF] neurips.cc

Decision transformer: Reinforcement learning via sequence modeling

L Chen, K Lu, A Rajeswaran, K Lee… - Advances in neural …, 2021 - proceedings.neurips.cc

We introduce a framework that abstracts Reinforcement Learning (RL) as a sequence
modeling problem. This allows us to draw upon the simplicity and scalability of the …

Cited by 1271 · Related articles · All 10 versions

[PDF] neurips.cc

Habitat 2.0: Training home assistants to rearrange their habitat

A Szot, A Clegg, E Undersander… - Advances in neural …, 2021 - proceedings.neurips.cc

We introduce Habitat 2.0 (H2.0), a simulation platform for training virtual robots in
interactive 3D environments and complex physics-enabled scenarios. We make …

Cited by 404 · Related articles · All 6 versions

[PDF] mlr.press

Bigger, better, faster: Human-level atari with human-level efficiency

M Schwarzer, JSO Ceron, A Courville… - International …, 2023 - proceedings.mlr.press

We introduce a value-based RL agent, which we call BBF, that achieves super-human
performance in the Atari 100K benchmark. BBF relies on scaling the neural networks used …

Cited by 44 · Related articles · All 7 versions

[PDF] neurips.cc

Uncertainty-based offline reinforcement learning with diversified q-ensemble

G An, S Moon, JH Kim… - Advances in neural …, 2021 - proceedings.neurips.cc

Offline reinforcement learning (offline RL), which aims to find an optimal policy from a
previously collected static dataset, bears algorithmic difficulties due to function …

Cited by 221 · Related articles · All 7 versions
