Q-error as a selection mechanism in modular reinforcement-learning systems

C Gehring, D Precup - … of the 2013 international conference on …, 2013 - aamas.csc.liv.ac.uk

Exploration is still one of the crucial problems in reinforcement learning, especially for
agents acting in safety-critical situations. We propose a new directed exploration method …

Cited by 115 · Related articles · All 7 versions

[PDF] ieee.org

A reinforcement learning architecture that transfers knowledge between skills when solving multiple tasks

P Tommasino, D Caligiore, M Mirolli… - IEEE Transactions on …, 2016 - ieeexplore.ieee.org

When humans learn several skills to solve multiple tasks, they exhibit an extraordinary
capacity to transfer knowledge between them. We present here the last enhanced version of …

Cited by 33 · Related articles · All 6 versions

[PDF] plos.org

Modeling sensory-motor decisions in natural behavior

R Zhang, S Zhang, MH Tong, Y Cui… - PLoS computational …, 2018 - journals.plos.org

Although a standard reinforcement learning model can capture many aspects of
reward-seeking behaviors, it may not be practical for modeling human natural behaviors because of …

Cited by 15 · Related articles · All 16 versions

[PDF] frontiersin.org

Cooperative and competitive reinforcement and imitation learning for a mixture of heterogeneous learning modules

E Uchibe - Frontiers in neurorobotics, 2018 - frontiersin.org

This paper proposes Cooperative and competitive Reinforcement And Imitation Learning
(CRAIL) for selecting an appropriate policy from a set of multiple heterogeneous modules …

Cited by 13 · Related articles · All 10 versions

[PDF] academia.edu

The two-dimensional organization of behavior

M Ring, T Schaul, J Schmidhuber - 2011 IEEE International …, 2011 - ieeexplore.ieee.org

This paper addresses the problem of continual learning [1] in a new way, combining
multi-modular reinforcement learning with inspiration from the motor cortex to produce a unique …

Cited by 21 · Related articles · All 13 versions

[PDF] site44.com

The organization of behavior into temporal and spatial neighborhoods

M Ring, T Schaul - 2012 IEEE International Conference on …, 2012 - ieeexplore.ieee.org

The mot framework [1] is a system for learning behaviors while organizing them across a
two-dimensional, topological map such that similar behaviors are represented in nearby …

Cited by 6 · Related articles · All 13 versions

[PDF] tum.de

Studies in continuous black-box optimization

T Schaul - 2011 - mediatum.ub.tum.de

We present a collection of novel, state-of-the-art algorithms for solving problems in the class
of continuous black-box optimization. Natural Evolution Strategies are a family of algorithms …

Cited by 4 · Related articles · All 10 versions

[PDF] openreview.net

Temporal Difference Weighted Ensemble For Reinforcement Learning

T Seno, M Imai - openreview.net

Combining multiple function approximators in machine learning models typically leads to
better performance and robustness compared with a single function. In reinforcement …

[PDF] utexas.edu

[PDF] Organizing Behavior into Temporal and Spatial Neighborhoods

M Ring, T Schaul - cs.utexas.edu

Abstract: The mot framework (Ring, Schaul, and Schmidhuber 2011) is a system for
learning behaviors while organizing them across a two-dimensional, topological map such …

[CITATION] Seminar topics

M Baumann, T Kemmerich - 2012
