POMO: Policy optimization with multiple optima for reinforcement learning

YD Kwon, J Choo, B Kim, I Yoon… - Advances in Neural …, 2020 - proceedings.neurips.cc
In neural combinatorial optimization (CO), reinforcement learning (RL) can turn a deep
neural net into a fast, powerful heuristic solver of NP-hard problems. This approach has a …

Reinforcement learning improves behaviour from evaluative feedback

ML Littman - Nature, 2015 - nature.com
Reinforcement learning is a branch of machine learning concerned with using experience
gained through interacting with the world and evaluative feedback to improve a system's …

Count-based exploration in feature space for reinforcement learning

J Martin, SN Sasikumar, T Everitt, M Hutter - arXiv preprint arXiv …, 2017 - arxiv.org
We introduce a new count-based optimistic exploration algorithm for Reinforcement
Learning (RL) that is feasible in environments with high-dimensional state-action spaces …

Reinforcement learning in robotics: A survey

J Kober, JA Bagnell, J Peters - The International Journal of …, 2013 - journals.sagepub.com
Reinforcement learning offers to robotics a framework and set of tools for the design of
sophisticated and hard-to-engineer behaviors. Conversely, the challenges of robotic …

On simple reactive neural networks for behaviour-based reinforcement learning

A Pore, G Aragon-Camarasa - 2020 IEEE International …, 2020 - ieeexplore.ieee.org
We present a behaviour-based reinforcement learning approach, inspired by Brooks'
subsumption architecture, in which simple fully connected networks are trained as reactive …

Rethinking Population-assisted Off-policy Reinforcement Learning

B Zheng, R Cheng - Proceedings of the Genetic and Evolutionary …, 2023 - dl.acm.org
While off-policy reinforcement learning (RL) algorithms are sample efficient due to gradient-
based updates and data reuse in the replay buffer, they struggle with convergence to local …

Time limits in reinforcement learning

F Pardo, A Tavakoli, V Levdik… - … on Machine Learning, 2018 - proceedings.mlr.press
In reinforcement learning, it is common to let an agent interact for a fixed amount of time with
its environment before resetting it and repeating the process in a series of episodes. The …

A Restart-based Rank-1 Evolution Strategy for Reinforcement Learning.

Z Chen, Y Zhou, X He, S Jiang - IJCAI, 2019 - ijcai.org
Evolution strategies have been demonstrated to have the strong ability to roughly train deep
neural networks and well accomplish reinforcement learning tasks. However, existing …

Reinforcement Learning with Heuristic Information

T Brys - Dissertation, Vrije Universiteit Brussel, 2016 - ai.vub.ac.be
Reinforcement learning is becoming increasingly popular in machine learning communities
in academia and industry alike. Experimental successes in the past few years have hinted at …

Challenges of real-world reinforcement learning

G Dulac-Arnold, D Mankowitz, T Hester - arXiv preprint arXiv:1904.12901, 2019 - arxiv.org
Reinforcement learning (RL) has proven its worth in a series of artificial domains, and is
beginning to show some successes in real-world scenarios. However, much of the research …