BYOL-Explore: Exploration by bootstrapped prediction

Z Guo, S Thakoor, M Pîslar… - Advances in neural …, 2022 - proceedings.neurips.cc
We present BYOL-Explore, a conceptually simple yet general approach for curiosity-driven
exploration in visually complex environments. BYOL-Explore learns the world …

Reward-free exploration for reinforcement learning

C Jin, A Krishnamurthy… - … on Machine Learning, 2020 - proceedings.mlr.press
Exploration is widely regarded as one of the most challenging aspects of reinforcement
learning (RL), with many naive approaches succumbing to exponential sample complexity …

Convex reinforcement learning in finite trials

M Mutti, R De Santi, P De Bartolomeis… - Journal of Machine …, 2023 - jmlr.org
Convex Reinforcement Learning (RL) is a recently introduced framework that generalizes
the standard RL objective to any convex (or concave) function of the state distribution …

Maximum state entropy exploration using predecessor and successor representations

AK Jain, L Lehnert, I Rish… - Advances in Neural …, 2024 - proceedings.neurips.cc
Animals have a developed ability to explore that aids them in important tasks such as
locating food, exploring for shelter, and finding misplaced items. These exploration skills …

CEM: Constrained entropy maximization for task-agnostic safe exploration

Q Yang, MTJ Spaan - Proceedings of the AAAI Conference on Artificial …, 2023 - ojs.aaai.org
In the absence of assigned tasks, a learning agent typically seeks to explore its environment
efficiently. However, the pursuit of exploration will bring more safety risks. An under-explored …

Constrained episodic reinforcement learning in concave-convex and knapsack settings

K Brantley, M Dudik, T Lykouris… - Advances in …, 2020 - proceedings.neurips.cc
We propose an algorithm for tabular episodic reinforcement learning with constraints. We
provide a modular analysis with strong theoretical guarantees for settings with concave …

Submodular reinforcement learning

M Prajapat, M Mutný, MN Zeilinger… - arXiv preprint arXiv …, 2023 - arxiv.org
In reinforcement learning (RL), rewards of states are typically considered additive, and
following the Markov assumption, they are independent of states visited …

Regret guarantees for model-based reinforcement learning with long-term average constraints

M Agarwal, Q Bai, V Aggarwal - Uncertainty in Artificial …, 2022 - proceedings.mlr.press
We consider the problem of constrained Markov Decision Process (CMDP) where an agent
interacts with an ergodic Markov Decision Process. At every interaction, the agent obtains a …

Active exploration via experiment design in Markov chains

M Mutny, T Janik, A Krause - International Conference on …, 2023 - proceedings.mlr.press
A key challenge in science and engineering is to design experiments to learn about some
unknown quantity of interest. Classical experimental design optimally allocates the …

The importance of non-Markovianity in maximum state entropy exploration

M Mutti, R De Santi, M Restelli - International Conference on …, 2022 - proceedings.mlr.press
In the maximum state entropy exploration framework, an agent interacts with a reward-free
environment to learn a policy that maximizes the entropy of the expected state visitations it is …