Sample complexity bounds for stochastic shortest path with a generative model

M Berg, M Feldmann, L Kirchner, T Kube - Neuroscience & Biobehavioral …, 2022 - Elsevier

Rumination is a widely recognized cognitive deviation in depression. Despite the
recognition, researchers have struggled to explain why patients cannot disengage from the …

Cited by 20 - Related articles - All 4 versions

[PDF] mlr.press

Learning stochastic shortest path with linear function approximation

Y Min, J He, T Wang, Q Gu - International Conference on …, 2022 - proceedings.mlr.press

We study the stochastic shortest path (SSP) problem in reinforcement learning with linear
function approximation, where the transition kernel is represented as a linear mixture of …

Cited by 33 - Related articles - All 9 versions

[PDF] neurips.cc

Stochastic shortest path: Minimax, parameter-free and towards horizon-free regret

J Tarbouriech, R Zhou, SS Du… - Advances in neural …, 2021 - proceedings.neurips.cc

We study the problem of learning in the stochastic shortest path (SSP) setting, where an
agent seeks to minimize the expected cost accumulated before reaching a goal state. We …

Cited by 36 - Related articles - All 13 versions

[PDF] neurips.cc

Implicit finite-horizon approximation and efficient optimal algorithms for stochastic shortest path

L Chen, M Jafarnia-Jahromi… - Advances in Neural …, 2021 - proceedings.neurips.cc

We introduce a generic template for developing regret minimization algorithms in the
Stochastic Shortest Path (SSP) model, which achieves minimax optimal regret as long as …

Cited by 26 - Related articles - All 8 versions

[PDF] neurips.cc

Policy optimization with linear temporal logic constraints

C Voloshin, H Le, S Chaudhuri… - Advances in Neural …, 2022 - proceedings.neurips.cc

We study the problem of policy optimization (PO) with linear temporal logic (LTL) constraints.
The language of LTL allows flexible description of tasks that may be unnatural to encode as …

Cited by 21 - Related articles - All 12 versions

[PDF] arxiv.org

Online learning for stochastic shortest path model via posterior sampling

M Jafarnia-Jahromi, L Chen, R Jain, H Luo - arXiv preprint arXiv …, 2021 - arxiv.org

We consider the problem of online reinforcement learning for the Stochastic Shortest Path
(SSP) problem modeled as an unknown MDP with an absorbing state. We propose PSRL …

Cited by 20 - Related articles - All 3 versions

[PDF] mlr.press

Posterior sampling-based online learning for the stochastic shortest path model

M Jafarnia-Jahromi, L Chen, R Jain… - Uncertainty in Artificial …, 2023 - proceedings.mlr.press

We consider the problem of online reinforcement learning for the Stochastic Shortest Path
(SSP) problem modeled as an unknown MDP with an absorbing state. We propose PSRL …

Cited by 2 - Related articles - All 3 versions

[PDF] neurips.cc

A provably efficient sample collection strategy for reinforcement learning

J Tarbouriech, M Pirotta, M Valko… - Advances in Neural …, 2021 - proceedings.neurips.cc

One of the challenges in online reinforcement learning (RL) is that the agent needs to trade
off the exploration of the environment and the exploitation of the samples to optimize its …

Cited by 18 - Related articles - All 12 versions

[PDF] mlr.press

Identification of Blackwell optimal policies for deterministic MDPs

V Boone, B Gaujal - International Conference on Artificial …, 2023 - proceedings.mlr.press

This paper investigates a new learning problem, the identification of Blackwell optimal
policies on deterministic MDPs (DMDPs): A learner has to return a Blackwell optimal policy …

Cited by 3 - Related articles - All 9 versions

[PDF] mlr.press

Reaching goals is hard: Settling the sample complexity of the stochastic shortest path

L Chen, A Tirinzoni, M Pirotta… - … on Algorithmic Learning …, 2023 - proceedings.mlr.press

We study the sample complexity of learning an $\epsilon$-optimal policy in the Stochastic
Shortest Path (SSP) problem. We first derive sample complexity bounds when the learner …

Cited by 3 - Related articles - All 3 versions
