Autonomous discovery of temporal abstractions from interaction with an environment

Recent advances in hierarchical reinforcement learning

AG Barto, S Mahadevan - Discrete event dynamic systems, 2003 - Springer

Reinforcement learning is bedeviled by the curse of dimensionality: the number of
parameters to be learned grows exponentially with the size of any compact encoding of a …

Cited by 1745 Related articles All 23 versions

[HTML] nih.gov

Hierarchically organized behavior and its neural foundations: A reinforcement learning perspective

MM Botvinick, Y Niv, AG Barto - Cognition, 2009 - Elsevier

Research on human and animal behavior has long emphasized its hierarchical structure—
the divisibility of ongoing behavior into discrete tasks, which are comprised of subtask …

Cited by 774 Related articles All 30 versions

[PDF] mlr.press

Zero-shot task generalization with multi-task deep reinforcement learning

J Oh, S Singh, H Lee, P Kohli - International Conference on …, 2017 - proceedings.mlr.press

As a step towards developing zero-shot task generalization capabilities in reinforcement
learning (RL), we introduce a new RL problem where the agent should learn to execute …

Cited by 311 Related articles All 9 versions

[HTML] neurips.cc

[HTML] Intrinsically motivated reinforcement learning

N Chentanez, A Barto, S Singh - Advances in neural …, 2004 - proceedings.neurips.cc

Psychologists call behavior intrinsically motivated when it is engaged in for its own sake
rather than as a step toward solving a specific problem of clear practical value. But what we …

Cited by 1025 Related articles All 21 versions

[HTML] nih.gov

Habits, action sequences and reinforcement learning

A Dezfouli, BW Balleine - European Journal of Neuroscience, 2012 - Wiley Online Library

It is now widely accepted that instrumental actions can be either goal‐directed or habitual;
whereas the former are rapidly acquired and regulated by their outcome, the latter are …

Cited by 390 Related articles All 11 versions

[PDF] psu.edu

[PDF] Intrinsically motivated learning of hierarchical collections of skills

AG Barto, S Singh, N Chentanez - Proceedings of the 3rd International …, 2004 - Citeseer

Humans and other animals often engage in activities for their own sakes rather than as steps
toward solving practical problems. Psychologists call these intrinsically motivated behaviors …

Cited by 567 Related articles All 11 versions

[PDF] jmlr.org

[PDF] Proto-value Functions: A Laplacian Framework for Learning Representation and Control in Markov Decision Processes

S Mahadevan, M Maggioni - Journal of Machine Learning Research, 2007 - jmlr.org

This paper introduces a novel spectral framework for solving Markov decision processes
(MDPs) by jointly learning representations and optimal policies. The major components of …

Cited by 397 Related articles All 20 versions

[PDF] umass.edu

Using relative novelty to identify useful temporal abstractions in reinforcement learning

Ö Şimşek, AG Barto - Proceedings of the twenty-first international …, 2004 - dl.acm.org

We present a new method for automatically creating useful temporal abstractions in
reinforcement learning. We argue that states that allow the agent to transition to a different …

Cited by 292 Related articles All 8 versions

[PDF] mlr.press

Skill discovery for exploration and planning using deep skill graphs

A Bagaria, JK Senthil… - … Conference on Machine …, 2021 - proceedings.mlr.press

We introduce a new skill-discovery algorithm that builds a discrete graph representation of
large continuous MDPs, where nodes correspond to skill subgoals and the edges to skill …

Cited by 44 Related articles All 14 versions

[PDF] aaai.org

Investigating contingency awareness using Atari 2600 games

M Bellemare, J Veness, M Bowling - … of the AAAI Conference on Artificial …, 2012 - ojs.aaai.org

Contingency awareness is the recognition that some aspects of a future observation are
under an agent's control while others are solely determined by the environment. This paper …

Cited by 114 Related articles All 13 versions
