Improved sample complexity for incremental autonomous exploration in MDPs

D Tiapkin, D Belomestny… - International …, 2023 - proceedings.mlr.press

We address the challenge of exploration in reinforcement learning (RL) when the agent
operates in an unknown environment with sparse or no rewards. In this work, we study the …

Cited by 15 · Related articles · All 9 versions

Learning stochastic shortest path with linear function approximation

Y Min, J He, T Wang, Q Gu - International Conference on …, 2022 - proceedings.mlr.press

We study the stochastic shortest path (SSP) problem in reinforcement learning with linear
function approximation, where the transition kernel is represented as a linear mixture of …

Cited by 33 · Related articles · All 9 versions

Stochastic shortest path: Minimax, parameter-free and towards horizon-free regret

J Tarbouriech, R Zhou, SS Du… - Advances in neural …, 2021 - proceedings.neurips.cc

We study the problem of learning in the stochastic shortest path (SSP) setting, where an
agent seeks to minimize the expected cost accumulated before reaching a goal state. We …

Cited by 36 · Related articles · All 13 versions

Implicit finite-horizon approximation and efficient optimal algorithms for stochastic shortest path

L Chen, M Jafarnia-Jahromi… - Advances in Neural …, 2021 - proceedings.neurips.cc

We introduce a generic template for developing regret minimization algorithms in the
Stochastic Shortest Path (SSP) model, which achieves minimax optimal regret as long as …

Cited by 26 · Related articles · All 8 versions

A provably efficient sample collection strategy for reinforcement learning

J Tarbouriech, M Pirotta, M Valko… - Advances in Neural …, 2021 - proceedings.neurips.cc

One of the challenges in online reinforcement learning (RL) is that the agent needs to trade
off the exploration of the environment and the exploitation of the samples to optimize its …

Cited by 18 · Related articles · All 12 versions

Sample complexity bounds for stochastic shortest path with a generative model

J Tarbouriech, M Pirotta, M Valko… - Algorithmic Learning …, 2021 - proceedings.mlr.press

We consider the objective of computing an $\epsilon$-optimal policy in a stochastic shortest
path (SSP) setting, provided that we can access a generative sampling oracle. We propose …

Cited by 17 · Related articles · All 10 versions

Reaching goals is hard: Settling the sample complexity of the stochastic shortest path

L Chen, A Tirinzoni, M Pirotta… - … on Algorithmic Learning …, 2023 - proceedings.mlr.press

We study the sample complexity of learning an $\epsilon$-optimal policy in the Stochastic
Shortest Path (SSP) problem. We first derive sample complexity bounds when the learner …

Cited by 3 · Related articles · All 3 versions

Adaptive multi-goal exploration

J Tarbouriech, OD Domingues… - International …, 2022 - proceedings.mlr.press

We introduce a generic strategy for provably efficient multi-goal exploration. It relies on
AdaGoal, a novel goal selection scheme that leverages a measure of uncertainty in reaching …

Cited by 4 · Related articles · All 4 versions

Near-optimal algorithms for autonomous exploration and multi-goal stochastic shortest path

H Cai, T Ma, S Du - International Conference on Machine …, 2022 - proceedings.mlr.press

We revisit the incremental autonomous exploration problem proposed by Lim and Auer
(2012). In this setting, the agent aims to learn a set of near-optimal goal-conditioned policies …

Cited by 4 · Related articles · All 6 versions

Layered state discovery for incremental autonomous exploration

L Chen, A Tirinzoni, A Lazaric… - … Conference on Machine …, 2023 - proceedings.mlr.press

We study the autonomous exploration (AX) problem proposed by Lim & Auer (2012). In this
setting, the objective is to discover a set of $\epsilon$-optimal policies reaching a set …
