Smart exploration in reinforcement learning using absolute temporal difference errors

C Gehring, D Precup - … of the 2013 international conference on …, 2013 - aamas.csc.liv.ac.uk
Exploration is still one of the crucial problems in reinforcement learning, especially for
agents acting in safety-critical situations. We propose a new directed exploration method …

A reinforcement learning architecture that transfers knowledge between skills when solving multiple tasks

P Tommasino, D Caligiore, M Mirolli… - IEEE Transactions on …, 2016 - ieeexplore.ieee.org
When humans learn several skills to solve multiple tasks, they exhibit an extraordinary
capacity to transfer knowledge between them. We present here the last enhanced version of …

Modeling sensory-motor decisions in natural behavior

R Zhang, S Zhang, MH Tong, Y Cui… - PLoS computational …, 2018 - journals.plos.org
Although a standard reinforcement learning model can capture many aspects of reward-seeking
behaviors, it may not be practical for modeling human natural behaviors because of …

Cooperative and competitive reinforcement and imitation learning for a mixture of heterogeneous learning modules

E Uchibe - Frontiers in neurorobotics, 2018 - frontiersin.org
This paper proposes Cooperative and competitive Reinforcement And Imitation Learning
(CRAIL) for selecting an appropriate policy from a set of multiple heterogeneous modules …

The two-dimensional organization of behavior

M Ring, T Schaul, J Schmidhuber - 2011 IEEE International …, 2011 - ieeexplore.ieee.org
This paper addresses the problem of continual learning [1] in a new way, combining multi-modular
reinforcement learning with inspiration from the motor cortex to produce a unique …

The organization of behavior into temporal and spatial neighborhoods

M Ring, T Schaul - 2012 IEEE International Conference on …, 2012 - ieeexplore.ieee.org
The mot framework [1] is a system for learning behaviors while organizing them across a
two-dimensional, topological map such that similar behaviors are represented in nearby …

Studies in continuous black-box optimization

T Schaul - 2011 - mediatum.ub.tum.de
We present a collection of novel, state-of-the-art algorithms for solving problems in the class
of continuous black-box optimization. Natural Evolution Strategies are a family of algorithms …

Temporal Difference Weighted Ensemble For Reinforcement Learning

T Seno, M Imai - openreview.net
Combining multiple function approximators in machine learning models typically leads to
better performance and robustness compared with a single function. In reinforcement …

Organizing Behavior into Temporal and Spatial Neighborhoods

M Ring, T Schaul - cs.utexas.edu
Abstract: The mot framework (Ring, Schaul, and Schmidhuber 2011) is a system for
learning behaviors while organizing them across a two-dimensional, topological map such …

Seminar topics

M Baumann, T Kemmerich - 2012