Related articles - Academic resource search

Hierarchical planning through goal-conditioned offline reinforcement learning

J Li, C Tang, M Tomizuka… - IEEE Robotics and …, 2022 - ieeexplore.ieee.org

Offline reinforcement learning (RL) has shown potential in many safety-critical tasks in robotics
where exploration is risky and expensive. However, it still struggles to acquire skills in …

Cited by 42 · Related articles · All 4 versions

[PDF] arxiv.org

Model-based offline planning

A Argenson, G Dulac-Arnold - arXiv preprint arXiv:2008.05556, 2020 - arxiv.org

Offline learning is a key part of making reinforcement learning (RL) useable in real systems.
Offline RL looks at scenarios where there is data from a system's operation, but no direct …

Cited by 146 · Related articles · All 3 versions

[PDF] arxiv.org

Model-based offline planning with trajectory pruning

X Zhan, X Zhu, H Xu - arXiv preprint arXiv:2105.07351, 2021 - arxiv.org

The recent offline reinforcement learning (RL) studies have achieved much progress to
make RL usable in real-world systems by learning policies from pre-collected datasets …

Cited by 26 · Related articles · All 8 versions

[PDF] aaai.org

Reinforcement learning for classical planning: Viewing heuristics as dense reward generators

C Gehring, M Asai, R Chitnis, T Silver… - Proceedings of the …, 2022 - ojs.aaai.org

Recent advances in reinforcement learning (RL) have led to a growing interest in applying
RL to classical planning domains or applying classical planning methods to some complex …

Cited by 36 · Related articles · All 10 versions

[PDF] mlr.press

Fighting uncertainty with gradients: Offline reinforcement learning via diffusion score matching

HJT Suh, G Chou, H Dai, L Yang… - … on Robot Learning, 2023 - proceedings.mlr.press

Gradient-based methods enable efficient search capabilities in high dimensions. However,
in order to apply them effectively in offline optimization paradigms such as offline …

Cited by 5 · Related articles · All 4 versions

Planning-augmented hierarchical reinforcement learning

R Gieselmann, FT Pokorny - IEEE Robotics and Automation …, 2021 - ieeexplore.ieee.org

Planning algorithms are powerful at solving long-horizon decision-making problems but
require that environment dynamics are known. Model-free reinforcement learning has …

Cited by 23 · Related articles · All 3 versions

[PDF] mlr.press

Learning off-policy with online planning

H Sikchi, W Zhou, D Held - Conference on Robot Learning, 2022 - proceedings.mlr.press

Reinforcement learning (RL) in low-data and risk-sensitive domains requires performant and
flexible deployment policies that can readily incorporate constraints during deployment. One …

Cited by 28 · Related articles · All 8 versions

[PDF] mlr.press

Latent plans for task-agnostic offline reinforcement learning

E Rosete-Beas, O Mees, G Kalweit… - … on Robot Learning, 2023 - proceedings.mlr.press

Everyday tasks of long-horizon and comprising a sequence of multiple implicit subtasks still
impose a major challenge in offline robot control. While a number of prior methods aimed to …

Cited by 48 · Related articles · All 6 versions

[PDF] neurips.cc

Planning with goal-conditioned policies

S Nasiriany, V Pong, S Lin… - Advances in Neural …, 2019 - proceedings.neurips.cc

Planning methods can solve temporally extended sequential decision making problems by
composing simple behaviors. However, planning requires suitable abstractions for the states …

Cited by 227 · Related articles · All 9 versions

[PDF] arxiv.org

Planning to practice: Efficient online fine-tuning by composing goals in latent space

K Fang, P Yin, A Nair, S Levine - 2022 IEEE/RSJ International …, 2022 - ieeexplore.ieee.org

General-purpose robots require diverse repertoires of behaviors to complete challenging
tasks in real-world unstructured environments. To address this issue, goal-conditioned …

Cited by 17 · Related articles · All 5 versions
