World models and predictive coding for cognitive and developmental robotics: Frontiers and challenges

T Taniguchi, S Murata, M Suzuki, D Ognibene… - Advanced …, 2023 - Taylor & Francis
Creating autonomous robots that can actively explore the environment, acquire knowledge
and learn skills continuously is the ultimate achievement envisioned in cognitive and …

A unified framework for stochastic optimization

WB Powell - European Journal of Operational Research, 2019 - Elsevier
Stochastic optimization is an umbrella term that includes over a dozen fragmented
communities, using a patchwork of sometimes overlapping notational systems with …

Partially observable Markov decision processes in robotics: A survey

M Lauri, D Hsu, J Pajarinen - IEEE Transactions on Robotics, 2022 - ieeexplore.ieee.org
Noisy sensing, imperfect control, and environment changes are defining characteristics of
many real-world robot tasks. The partially observable Markov decision process (POMDP) …

A definition of continual reinforcement learning

D Abel, A Barreto, B Van Roy… - Advances in …, 2024 - proceedings.neurips.cc
In a standard view of the reinforcement learning problem, an agent's goal is to efficiently
identify a policy that maximizes long-term reward. However, this perspective is based on a …

VariBAD: A very good method for Bayes-adaptive deep RL via meta-learning

L Zintgraf, K Shiarlis, M Igl, S Schulze, Y Gal… - arXiv preprint arXiv …, 2019 - arxiv.org
Trading off exploration and exploitation in an unknown environment is key to maximising
expected return during learning. A Bayes-optimal policy, which does so optimally, conditions …

Recurrent model-free RL can be a strong baseline for many POMDPs

T Ni, B Eysenbach, R Salakhutdinov - arXiv preprint arXiv:2110.05038, 2021 - arxiv.org
Many problems in RL, such as meta-RL, robust RL, generalization in RL, and temporal credit
assignment, can be cast as POMDPs. In theory, simply augmenting model-free RL with …

Offline Meta Reinforcement Learning--Identifiability Challenges and Effective Data Collection Strategies

R Dorfman, I Shenfeld, A Tamar - Advances in Neural …, 2021 - proceedings.neurips.cc
Consider the following instance of the Offline Meta Reinforcement Learning (OMRL)
problem: given the complete training logs of $N$ conventional RL agents, trained on $N …

[BOOK][B] Deep learning in science

P Baldi - 2021 - books.google.com
This is the first rigorous, self-contained treatment of the theory of deep learning. Starting with
the foundations of the theory and building it up, this is essential reading for any scientists …

Bootstrap latent-predictive representations for multitask reinforcement learning

ZD Guo, BA Pires, B Piot, JB Grill… - International …, 2020 - proceedings.mlr.press
Learning a good representation is an essential component for deep reinforcement learning
(RL). Representation learning is especially important in multitask and partially observable …

Reinforcement learning: A survey

LP Kaelbling, ML Littman, AW Moore - Journal of artificial intelligence …, 1996 - jair.org
This paper surveys the field of reinforcement learning from a computer-science perspective.
It is written to be accessible to researchers familiar with machine learning. Both the historical …