On the importance of exploration for generalization in reinforcement learning

Y Jiang, JZ Kolter, R Raileanu - Advances in Neural …, 2024 - proceedings.neurips.cc
Existing approaches for improving generalization in deep reinforcement learning (RL) have
mostly focused on representation learning, neglecting RL-specific aspects such as …

Adaptive reward-free exploration

E Kaufmann, P Ménard… - Algorithmic …, 2021 - proceedings.mlr.press
Reward-free exploration is a reinforcement learning setting recently studied by Jin et al.
(2020), who address it by running several algorithms with regret guarantees in parallel. In our …

A study of global and episodic bonuses for exploration in contextual MDPs

M Henaff, M Jiang, R Raileanu - International Conference on …, 2023 - proceedings.mlr.press
Exploration in environments which differ across episodes has received increasing attention
in recent years. Current methods use some combination of global novelty bonuses …

Learning stochastic shortest path with linear function approximation

Y Min, J He, T Wang, Q Gu - International Conference on …, 2022 - proceedings.mlr.press
We study the stochastic shortest path (SSP) problem in reinforcement learning with linear
function approximation, where the transition kernel is represented as a linear mixture of …

Geometric entropic exploration

ZD Guo, MG Azar, A Saade, S Thakoor, B Piot… - arXiv preprint arXiv …, 2021 - arxiv.org
Exploration is essential for solving complex Reinforcement Learning (RL) tasks. Maximum
State-Visitation Entropy (MSVE) formulates the exploration problem as a well-defined policy …

Near-optimal regret bounds for stochastic shortest path

A Rosenberg, A Cohen, Y Mansour… - … on Machine Learning, 2020 - proceedings.mlr.press
Stochastic shortest path (SSP) is a well-known problem in planning and control, in which an
agent has to reach a goal state in minimum total expected cost. In the learning formulation of …

Stochastic shortest path: Minimax, parameter-free and towards horizon-free regret

J Tarbouriech, R Zhou, SS Du… - Advances in neural …, 2021 - proceedings.neurips.cc
We study the problem of learning in the stochastic shortest path (SSP) setting, where an
agent seeks to minimize the expected cost accumulated before reaching a goal state. We …

Minimax regret for stochastic shortest path with adversarial costs and known transition

L Chen, H Luo, CY Wei - Conference on Learning Theory, 2021 - proceedings.mlr.press
We study the stochastic shortest path problem with adversarial costs and known transition,
and show that the minimax regret is $O(\sqrt{DT_\star K})$ and $O(\sqrt{DT_\star SA K}) …

Finding the stochastic shortest path with low regret: The adversarial cost and unknown transition case

L Chen, H Luo - International Conference on Machine …, 2021 - proceedings.mlr.press
We make significant progress toward the stochastic shortest path problem with adversarial
costs and unknown transition. Specifically, we develop algorithms that achieve $ O (\sqrt {S …

Improved no-regret algorithms for stochastic shortest path with linear MDP

L Chen, R Jain, H Luo - International Conference on …, 2022 - proceedings.mlr.press
We introduce two new no-regret algorithms for the stochastic shortest path (SSP) problem
with a linear MDP that significantly improve over the only existing results of (Vial et al …