Exploration in model-based reinforcement learning by empirically estimating learning progress

Intrinsic motivation, curiosity, and learning: Theory and applications in educational technologies

PY Oudeyer, J Gottlieb, M Lopes - Progress in brain research, 2016 - Elsevier

This chapter studies the bidirectional causal interactions between curiosity and learning and
discusses how understanding these interactions can be leveraged in educational …

Cited by 458 · All 13 versions

[HTML] nih.gov

Information-seeking, curiosity, and attention: computational and neural mechanisms

J Gottlieb, PY Oudeyer, M Lopes, A Baranes - Trends in cognitive sciences, 2013 - cell.com

Intelligent animals devote much time and energy to exploring and obtaining information, but
the underlying mechanisms are poorly understood. We review recent developments on this …

Cited by 944 · All 21 versions

[PDF] mlr.press

Planning to explore via self-supervised world models

R Sekar, O Rybkin, K Daniilidis… - International …, 2020 - proceedings.mlr.press

Reinforcement learning allows solving complex tasks, however, the learning tends to be
task-specific and the sample efficiency remains a challenge. We present Plan2Explore, a self …

Cited by 393 · All 8 versions

[PDF] ed.ac.uk

Exploration by random network distillation

Y Burda, H Edwards, A Storkey, O Klimov - arXiv preprint arXiv …, 2018 - arxiv.org

We introduce an exploration bonus for deep reinforcement learning methods that is easy to
implement and adds minimal overhead to the computation performed. The bonus is the error …

Cited by 1385 · All 10 versions

[PDF] nowpublishers.com

Model-based reinforcement learning: A survey

TM Moerland, J Broekens, A Plaat… - … and Trends® in …, 2023 - nowpublishers.com

Sequential decision making, commonly formalized as Markov Decision Process (MDP)
optimization, is an important challenge in artificial intelligence. Two key approaches to this …

Cited by 692 · All 17 versions

[PDF] arxiv.org

Large-scale study of curiosity-driven learning

Y Burda, H Edwards, D Pathak, A Storkey… - arXiv preprint arXiv …, 2018 - arxiv.org

Reinforcement learning algorithms rely on carefully engineering environment rewards that
are extrinsic to the agent. However, annotating each environment with hand-designed …

Cited by 852 · All 9 versions

[PDF] mlr.press

Curiosity-driven exploration by self-supervised prediction

D Pathak, P Agrawal, AA Efros… - … conference on machine …, 2017 - proceedings.mlr.press

In many real-world scenarios, rewards extrinsic to the agent are extremely sparse, or absent
altogether. In such cases, curiosity can serve as an intrinsic reward signal to enable the …

Cited by 2750 · All 16 versions

[PDF] mlr.press

Self-supervised exploration via disagreement

D Pathak, D Gandhi, A Gupta - International conference on …, 2019 - proceedings.mlr.press

Efficient exploration is a long-standing problem in sensorimotor learning. Major advances
have been demonstrated in noise-free, non-stochastic domains such as video games and …

Cited by 402 · All 6 versions

[PDF] neurips.cc

Unifying count-based exploration and intrinsic motivation

M Bellemare, S Srinivasan… - Advances in neural …, 2016 - proceedings.neurips.cc

We consider an agent's uncertainty about its environment and the problem of generalizing
this uncertainty across states. Specifically, we focus on the problem of exploration in non …

Cited by 1667 · All 9 versions

[PDF] mlr.press

Count-based exploration with neural density models

G Ostrovski, MG Bellemare, A Oord… - … on machine learning, 2017 - proceedings.mlr.press

Bellemare et al. (2016) introduced the notion of a pseudo-count, derived from a
density model, to generalize count-based exploration to non-tabular reinforcement learning …

Cited by 710 · All 5 versions
