An information-theoretic perspective on intrinsic motivation in reinforcement learning: A survey

A Aubret, L Matignon, S Hassas - Entropy, 2023 - mdpi.com
The reinforcement learning (RL) research area is very active, with an important number of
new contributions, especially considering the emergent field of deep RL (DRL). However, a …

Learning GFlowNets from partial episodes for improved convergence and stability

K Madan, J Rector-Brooks… - International …, 2023 - proceedings.mlr.press
Generative flow networks (GFlowNets) are a family of algorithms for training a sequential
sampler of discrete objects under an unnormalized target density and have been …

Convex reinforcement learning in finite trials

M Mutti, R De Santi, P De Bartolomeis… - Journal of Machine …, 2023 - jmlr.org
Convex Reinforcement Learning (RL) is a recently introduced framework that generalizes
the standard RL objective to any convex (or concave) function of the state distribution …

Thompson sampling for improved exploration in GFlowNets

J Rector-Brooks, K Madan, M Jain, M Korablyov… - arXiv preprint arXiv …, 2023 - arxiv.org
Generative flow networks (GFlowNets) are amortized variational inference algorithms that
treat sampling from a distribution over compositional objects as a sequential decision …

Curiosity-driven and victim-aware adversarial policies

C Gong, Z Yang, Y Bai, J Shi, A Sinha, B Xu… - Proceedings of the 38th …, 2022 - dl.acm.org
Recent years have witnessed great potential in applying Deep Reinforcement Learning
(DRL) in various challenging applications, such as autonomous driving, nuclear fusion …

MADE: Exploration via maximizing deviation from explored regions

T Zhang, P Rashidinejad, J Jiao… - Advances in …, 2021 - proceedings.neurips.cc
In online reinforcement learning (RL), efficient exploration remains particularly challenging
in high-dimensional environments with sparse rewards. In low-dimensional environments …

Fast rates for maximum entropy exploration

D Tiapkin, D Belomestny… - International …, 2023 - proceedings.mlr.press
We address the challenge of exploration in reinforcement learning (RL) when the agent
operates in an unknown environment with sparse or no rewards. In this work, we study the …

Accelerating reinforcement learning with value-conditional state entropy exploration

D Kim, J Shin, P Abbeel, Y Seo - Advances in Neural …, 2024 - proceedings.neurips.cc
A promising technique for exploration is to maximize the entropy of visited state distribution,
i.e., state entropy, by encouraging uniform coverage of visited state space. While it has been …

Structure in reinforcement learning: A survey and open problems

A Mohan, A Zhang, M Lindauer - arXiv preprint arXiv:2306.16021, 2023 - academia.edu
Reinforcement Learning (RL), bolstered by the expressive capabilities of Deep Neural
Networks (DNNs) for function approximation, has demonstrated considerable success in …

CEM: Constrained entropy maximization for task-agnostic safe exploration

Q Yang, MTJ Spaan - Proceedings of the AAAI Conference on Artificial …, 2023 - ojs.aaai.org
In the absence of assigned tasks, a learning agent typically seeks to explore its environment
efficiently. However, the pursuit of exploration will bring more safety risks. An under-explored …