A survey on reinforcement learning methods in character animation

A Kwiatkowski, E Alvarado, V Kalogeiton… - Computer Graphics …, 2022 - Wiley Online Library
Reinforcement Learning is an area of Machine Learning focused on how agents can be
trained to make sequential decisions, and achieve a particular goal within an arbitrary …

A distributional code for value in dopamine-based reinforcement learning

W Dabney, Z Kurth-Nelson, N Uchida, CK Starkweather… - Nature, 2020 - nature.com
Since its introduction, the reward prediction error theory of dopamine has explained a wealth
of empirical phenomena, providing a unifying framework for understanding the …

First return, then explore

A Ecoffet, J Huizinga, J Lehman, KO Stanley, J Clune - Nature, 2021 - nature.com
Reinforcement learning promises to solve complex sequential-decision problems
autonomously by specifying a high-level reward function only. However, reinforcement …

Dopamine transients follow a striatal gradient of reward time horizons

A Mohebi, W Wei, L Pelattini, K Kim, JD Berke - Nature Neuroscience, 2024 - nature.com
Animals make predictions to guide their behavior and update those predictions through
experience. Transient increases in dopamine (DA) are thought to be critical signals for …

On the expressivity of Markov reward

D Abel, W Dabney, A Harutyunyan… - Advances in …, 2021 - proceedings.neurips.cc
Reward is the driving force for reinforcement-learning agents. This paper is dedicated to
understanding the expressivity of reward as a way to capture tasks that we would want an …

Recurrent model-free RL can be a strong baseline for many POMDPs

T Ni, B Eysenbach, R Salakhutdinov - arXiv preprint arXiv:2110.05038, 2021 - arxiv.org
Many problems in RL, such as meta-RL, robust RL, generalization in RL, and temporal credit
assignment, can be cast as POMDPs. In theory, simply augmenting model-free RL with …

Settling the reward hypothesis

M Bowling, JD Martin, D Abel… - … on Machine Learning, 2023 - proceedings.mlr.press
The reward hypothesis posits that, "all of what we mean by goals and purposes can be well
thought of as maximization of the expected value of the cumulative sum of a received scalar …

On the effect of auxiliary tasks on representation dynamics

C Lyle, M Rowland, G Ostrovski… - International …, 2021 - proceedings.mlr.press
While auxiliary tasks play a key role in shaping the representations learnt by reinforcement
learning agents, much is still unknown about the mechanisms through which this is …

[BOOK][B] Distributional reinforcement learning

MG Bellemare, W Dabney, M Rowland - 2023 - books.google.com
The first comprehensive guide to distributional reinforcement learning, providing a new
mathematical formalism for thinking about decisions from a probabilistic perspective …

A self-tuning actor-critic algorithm

T Zahavy, Z Xu, V Veeriah, M Hessel… - Advances in neural …, 2020 - proceedings.neurips.cc
Reinforcement learning algorithms are highly sensitive to the choice of hyperparameters,
typically requiring significant manual effort to identify hyperparameters that perform well on a …