Related articles - Academic Resource Search

Multi-game decision transformers

KH Lee, O Nachum, MS Yang, L Lee… - Advances in …, 2022 - proceedings.neurips.cc

A longstanding goal of the field of AI is a method for learning a highly capable, generalist
agent from diverse experience. In the subfields of vision and language, this was largely …

Cited by 236 · Related articles · All 10 versions

[PDF] arxiv.org

Mastering Atari with discrete world models

D Hafner, T Lillicrap, M Norouzi, J Ba - arXiv preprint arXiv:2010.02193, 2020 - arxiv.org

Intelligent agents need to generalize from past experience to achieve goals in complex
environments. World models facilitate such generalization and allow learning behaviors …

Cited by 920 · Related articles · All 7 versions

[PDF] arxiv.org

MinAtar: An Atari-inspired testbed for thorough and reproducible reinforcement learning experiments

K Young, T Tian - arXiv preprint arXiv:1903.03176, 2019 - arxiv.org

The Arcade Learning Environment (ALE) is a popular platform for evaluating reinforcement
learning agents. Much of the appeal comes from the fact that Atari games demonstrate …

Cited by 121 · Related articles · All 2 versions

[PDF] openreview.net

Investigating multi-task pretraining and generalization in reinforcement learning

AA Taiga, R Agarwal, J Farebrother… - The Eleventh …, 2023 - openreview.net

Deep reinforcement learning (RL) has achieved remarkable successes in complex single-task
settings. However, designing RL agents that can learn multiple tasks and leverage prior …

Cited by 34 · Related articles · All 2 versions

[PDF] arxiv.org

Model-based reinforcement learning for Atari

L Kaiser, M Babaeizadeh, P Milos, B Osinski… - arXiv preprint arXiv …, 2019 - arxiv.org

Model-free reinforcement learning (RL) can be used to learn effective policies for complex
tasks, such as Atari games, even from image observations. However, this typically requires …

Cited by 1055 · Related articles · All 6 versions

[PDF] mlr.press

Contrastive decision transformers

SG Konan, E Seraj… - Conference on Robot …, 2023 - proceedings.mlr.press

Decision Transformers (DT) have drawn upon the success of Transformers by abstracting
Reinforcement Learning as a target-return-conditioned, sequence modeling problem. In our …

Cited by 18 · Related articles · All 2 versions

[PDF] mlr.press

Emergent agentic transformer from chain of hindsight experience

H Liu, P Abbeel - International Conference on Machine …, 2023 - proceedings.mlr.press

Large transformer models powered by diverse data and model scale have dominated
natural language modeling and computer vision and pushed the frontier of multiple AI areas …

Cited by 21 · Related articles · All 6 versions

[PDF] neurips.cc

Reward learning from human preferences and demonstrations in Atari

B Ibarz, J Leike, T Pohlen, G Irving… - Advances in neural …, 2018 - proceedings.neurips.cc

To solve complex real-world problems with reinforcement learning, we cannot rely on
manually specified reward functions. Instead, we need humans to communicate an objective …

Cited by 449 · Related articles · All 7 versions

[PDF] arxiv.org

Using natural language for reward shaping in reinforcement learning

P Goyal, S Niekum, RJ Mooney - arXiv preprint arXiv:1903.02020, 2019 - arxiv.org

Recent reinforcement learning (RL) approaches have shown strong performance in complex
domains such as Atari games, but are often highly sample inefficient. A common approach to …

Cited by 192 · Related articles · All 10 versions

[PDF] mlr.press

Agent57: Outperforming the Atari human benchmark

AP Badia, B Piot, S Kapturowski… - International …, 2020 - proceedings.mlr.press

Atari games have been a long-standing benchmark in the reinforcement learning (RL)
community for the past decade. This benchmark was proposed to test general competency …

Cited by 703 · Related articles · All 5 versions
