Imagination-augmented agents for deep reinforcement learning

C Yu, J Liu, S Nemati, G Yin - ACM Computing Surveys (CSUR), 2021 - dl.acm.org

As a subfield of machine learning, reinforcement learning (RL) aims at optimizing decision
making by using interaction samples of an agent with its environment and the potentially …

Cited by 680 Related articles All 5 versions

[PDF] zjujournals.com

Deep reinforcement learning: a survey

H Wang, N Liu, Y Zhang, D Feng, F Huang, D Li… - Frontiers of Information …, 2020 - Springer

Deep reinforcement learning (RL) has become one of the most popular topics in artificial
intelligence research. It has been widely used in various fields, such as end-to-end control …

Cited by 232 Related articles All 11 versions

[PDF] arxiv.org

A generalist agent

S Reed, K Zolna, E Parisotto, SG Colmenarejo… - arXiv preprint arXiv …, 2022 - arxiv.org

Inspired by progress in large-scale language modeling, we apply a similar approach
towards building a single generalist agent beyond the realm of text outputs. The agent …

Cited by 861 Related articles All 4 versions

[PDF] ieee.org

A metaverse: Taxonomy, components, applications, and open challenges

SM Park, YG Kim - IEEE access, 2022 - ieeexplore.ieee.org

Unlike previous studies on the Metaverse based on Second Life, the current Metaverse is
based on the social value of Generation Z that online and offline selves are not different …

Cited by 1624 Related articles All 6 versions

[PDF] jair.org Full View

A survey of zero-shot generalisation in deep reinforcement learning

R Kirk, A Zhang, E Grefenstette, T Rocktäschel - Journal of Artificial …, 2023 - jair.org

The study of zero-shot generalisation (ZSG) in deep Reinforcement Learning (RL) aims to
produce RL algorithms whose policies generalise well to novel unseen situations at …

Cited by 356 Related articles All 9 versions

[PDF] arxiv.org

Foundation models for decision making: Problems, methods, and opportunities

S Yang, O Nachum, Y Du, J Wei, P Abbeel… - arXiv preprint arXiv …, 2023 - arxiv.org

Foundation models pretrained on diverse data at scale have demonstrated extraordinary
capabilities in a wide range of vision and language tasks. When such models are deployed …

Cited by 118 Related articles All 3 versions

[PDF] neurips.cc

Mopo: Model-based offline policy optimization

T Yu, G Thomas, L Yu, S Ermon… - Advances in …, 2020 - proceedings.neurips.cc

Offline reinforcement learning (RL) refers to the problem of learning policies entirely from a
batch of previously collected data. This problem setting is compelling, because it offers the …

Cited by 810 Related articles All 11 versions

[PDF] neurips.cc

When to trust your model: Model-based policy optimization

M Janner, J Fu, M Zhang… - Advances in neural …, 2019 - proceedings.neurips.cc

Designing effective model-based reinforcement learning algorithms is difficult because the
ease of data generation must be weighed against the bias of model-generated data. In this …

Cited by 1000 Related articles All 10 versions

[PDF] nowpublishers.com

Model-based reinforcement learning: A survey

TM Moerland, J Broekens, A Plaat… - … and Trends® in …, 2023 - nowpublishers.com

Sequential decision making, commonly formalized as Markov Decision Process (MDP)
optimization, is an important challenge in artificial intelligence. Two key approaches to this …

Cited by 786 Related articles All 17 versions

[PDF] neurips.cc

Recurrent world models facilitate policy evolution

D Ha, J Schmidhuber - Advances in neural information …, 2018 - proceedings.neurips.cc

A generative recurrent neural network is quickly trained in an unsupervised manner to
model popular reinforcement learning environments through compressed spatio-temporal …

Cited by 1067 Related articles All 9 versions
