Planning with diffusion for flexible behavior synthesis

M Janner, Y Du, JB Tenenbaum, S Levine - arXiv preprint arXiv …, 2022 - arxiv.org
Model-based reinforcement learning methods often use learning only for the purpose of
estimating an approximate dynamics model, offloading the rest of the decision-making work …
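
A minimal sketch of the planning loop this framing implies: treat the whole trajectory as a single array and refine it by iterative denoising, pinning the first state to the current observation. The denoiser below is a random-map stand-in for the paper's trained diffusion model, and the step size is arbitrary.

    import numpy as np

    # Toy stand-in for a learned denoising network eps_theta(traj, t).
    # The actual method uses a trained diffusion model over trajectories;
    # this fixed random linear map is purely for illustration.
    rng = np.random.default_rng(0)
    H, D = 16, 4                      # horizon, state+action dimension
    W = rng.normal(scale=0.1, size=(H * D, H * D))

    def eps_theta(traj, t):
        return (W @ traj.ravel()).reshape(H, D)

    def plan(s0, n_steps=50):
        """Denoise a Gaussian trajectory into a plan, keeping the first
        state pinned to the observed state s0 (inpainting-style conditioning)."""
        traj = rng.normal(size=(H, D))
        for t in reversed(range(n_steps)):
            traj = traj - 0.1 * eps_theta(traj, t)   # one simplified denoising step
            traj[0, :len(s0)] = s0                   # condition on the current state
        return traj

    plan(np.zeros(2))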

Offline reinforcement learning as one big sequence modeling problem

M Janner, Q Li, S Levine - Advances in neural information …, 2021 - proceedings.neurips.cc
Reinforcement learning (RL) is typically viewed as the problem of estimating single-step
policies (for model-free RL) or single-step models (for model-based RL), leveraging the …
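
The core move is to flatten trajectories into token streams so that a standard autoregressive transformer can serve as both dynamics model and policy. A hedged sketch of the discretization side, with uniform per-dimension binning whose vocabulary size is illustrative rather than the paper's setting:

    import numpy as np

    N_BINS = 100  # illustrative per-dimension vocabulary size

    def tokenize(trajectory, low, high):
        """Map each (state, action, reward) dimension to a discrete token so a
        standard autoregressive transformer can model the whole sequence."""
        t = (trajectory - low) / (high - low)                  # normalize to [0, 1]
        return np.clip((t * N_BINS).astype(int), 0, N_BINS - 1).ravel()

    def detokenize(tokens, low, high, dim):
        centers = (tokens.reshape(-1, dim) + 0.5) / N_BINS
        return low + centers * (high - low)

    traj = np.array([[0.3, -1.2, 0.5], [0.1, 0.4, 1.0]])  # rows: (s, a, r) triples
    toks = tokenize(traj, low=-2.0, high=2.0)
    round_trip = detokenize(toks, -2.0, 2.0, dim=3)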

COMBO: Conservative offline model-based policy optimization

T Yu, A Kumar, R Rafailov… - Advances in neural …, 2021 - proceedings.neurips.cc
Model-based reinforcement learning (RL) algorithms, which learn a dynamics
model from logged experience and perform conservative planning under the learned model …
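
Conservatism in this family typically means pushing Q-values down on model-generated state-action pairs and up on pairs the dataset actually contains, alongside the usual Bellman backup. A toy tabular sketch of just that penalty term (the coefficient beta and the update form are a simplification, not the paper's exact objective):

    import numpy as np

    n_states, n_actions = 10, 4
    Q = np.zeros((n_states, n_actions))

    def conservative_update(Q, dataset_sa, model_sa, beta=1.0, lr=0.1):
        """Lower Q on (s, a) pairs sampled from model rollouts, raise it on
        pairs from the logged dataset; the Bellman backup is omitted here."""
        for s, a in model_sa:
            Q[s, a] -= lr * beta        # penalize out-of-distribution pairs
        for s, a in dataset_sa:
            Q[s, a] += lr * beta        # trust pairs the data actually contains
        return Q

    dataset_sa = [(0, 1), (2, 3)]       # from the logged data
    model_sa = [(5, 0), (7, 2)]         # from synthetic model rollouts
    Q = conservative_update(Q, dataset_sa, model_sa)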

Offline reinforcement learning with fisher divergence critic regularization

I Kostrikov, R Fergus, J Tompson… - … on Machine Learning, 2021 - proceedings.mlr.press
Many modern approaches to offline Reinforcement Learning (RL) utilize behavior
regularization, typically augmenting a model-free actor critic algorithm with a penalty …
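
One common shape for such a critic penalty is a squared norm on the critic's action-gradient, which keeps the critic from inventing sharp value peaks away from the data. A PyTorch sketch under assumed names (q, critic_penalty); it shows the general gradient-penalty pattern, not the paper's exact Fisher-divergence loss:

    import torch
    import torch.nn as nn

    q = nn.Sequential(nn.Linear(6, 64), nn.ReLU(), nn.Linear(64, 1))  # toy critic

    def critic_penalty(states, actions):
        """Penalize the squared norm of the critic's action-gradient,
        discouraging sharp value peaks on actions the data does not support."""
        actions = actions.clone().requires_grad_(True)
        q_vals = q(torch.cat([states, actions], dim=-1)).sum()
        (grad,) = torch.autograd.grad(q_vals, actions, create_graph=True)
        return (grad ** 2).sum(dim=-1).mean()

    states = torch.randn(32, 4)
    actions = torch.randn(32, 2)
    loss = critic_penalty(states, actions)   # added to the usual TD loss
    loss.backward()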

RAMBO-RL: Robust adversarial model-based offline reinforcement learning

M Rigter, B Lacerda, N Hawes - Advances in neural …, 2022 - proceedings.neurips.cc
Offline reinforcement learning (RL) aims to find performant policies from logged data without
further environment interaction. Model-based algorithms, which learn a model of the …
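
The adversarial ingredient is to train the dynamics model against the policy: fit the logged transitions while also nudging predicted next states toward outcomes the policy values poorly. A toy PyTorch sketch of that two-term objective (the weight lam and the value stand-in are assumptions for illustration):

    import torch
    import torch.nn as nn

    model = nn.Linear(6, 4)        # toy dynamics: (s, a) -> next state
    value = nn.Linear(4, 1)        # stand-in for the policy's value estimate

    def rambo_style_loss(s, a, s_next, lam=0.1):
        """Fit the data (MSE term) while adversarially lowering the value the
        policy would obtain under the model's predicted next states."""
        pred = model(torch.cat([s, a], dim=-1))
        fit = ((pred - s_next) ** 2).mean()
        adversarial = value(pred).mean()     # model descends the policy's value
        return fit + lam * adversarial

    s, a, s_next = torch.randn(32, 4), torch.randn(32, 2), torch.randn(32, 4)
    loss = rambo_style_loss(s, a, s_next)
    loss.backward()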

RvS: What is essential for offline RL via supervised learning?

S Emmons, B Eysenbach, I Kostrikov… - arXiv preprint arXiv …, 2021 - arxiv.org
Recent work has shown that supervised learning alone, without temporal difference (TD)
learning, can be remarkably effective for offline RL. When does this hold true, and which …
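
Concretely, "supervised learning alone" here means conditional behavior cloning: regress logged actions on the state plus an outcome such as a goal or reward-to-go. A minimal sketch with arbitrary sizes and architecture:

    import torch
    import torch.nn as nn

    policy = nn.Sequential(nn.Linear(5, 64), nn.ReLU(), nn.Linear(64, 2))

    def rvs_step(states, outcomes, actions, opt):
        """One conditional-BC step: predict the logged action from the state
        plus the outcome (goal or reward-to-go) the trajectory achieved."""
        pred = policy(torch.cat([states, outcomes], dim=-1))
        loss = ((pred - actions) ** 2).mean()
        opt.zero_grad(); loss.backward(); opt.step()
        return loss.item()

    opt = torch.optim.Adam(policy.parameters(), lr=1e-3)
    states, outcomes, actions = torch.randn(64, 4), torch.randn(64, 1), torch.randn(64, 2)
    rvs_step(states, outcomes, actions, opt)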

Temporal difference learning for model predictive control

N Hansen, X Wang, H Su - arXiv preprint arXiv:2203.04955, 2022 - arxiv.org
Data-driven model predictive control has two key advantages over model-free methods: a
potential for improved sample efficiency through model learning, and better performance as …
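
The combination is to plan over a short horizon with the learned model and cap each rollout with a TD-learned terminal value, so the planner need not look far ahead. A toy random-shooting sketch in which the model, reward, and value functions are stand-ins:

    import numpy as np

    rng = np.random.default_rng(0)
    step = lambda s, a: 0.9 * s + a          # toy dynamics model
    reward = lambda s, a: -np.sum(s ** 2)    # toy reward
    value = lambda s: -np.sum(s ** 2)        # stand-in for a TD-learned value

    def plan(s0, horizon=5, n_samples=64, gamma=0.99):
        """Score sampled action sequences by short model rollouts plus a
        terminal value estimate; return the first action of the best one."""
        best_ret, best_a0 = -np.inf, None
        for _ in range(n_samples):
            seq = rng.normal(scale=0.3, size=(horizon, s0.shape[0]))
            s, ret = s0, 0.0
            for t, a in enumerate(seq):
                ret += (gamma ** t) * reward(s, a)
                s = step(s, a)
            ret += (gamma ** horizon) * value(s)   # terminal value closes the horizon
            if ret > best_ret:
                best_ret, best_a0 = ret, seq[0]
        return best_a0

    plan(np.ones(3))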

Challenges of real-world reinforcement learning: definitions, benchmarks and analysis

G Dulac-Arnold, N Levine, DJ Mankowitz, J Li… - Machine Learning, 2021 - Springer
Reinforcement learning (RL) has proven its worth in a series of artificial domains, and is
beginning to show some successes in real-world scenarios. However, much of the research …

Driving into the future: Multiview visual forecasting and planning with world model for autonomous driving

Y Wang, J He, L Fan, H Li, Y Chen… - Proceedings of the …, 2024 - openaccess.thecvf.com
In autonomous driving, predicting future events in advance and evaluating the foreseeable
risks empowers autonomous vehicles to plan their actions, enhancing safety and efficiency …

Elastic decision transformer

YH Wu, X Wang, M Hamaya - Advances in Neural …, 2024 - proceedings.neurips.cc
This paper introduces Elastic Decision Transformer (EDT), a significant
advancement over the existing Decision Transformer (DT) and its variants. Although DT …
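
For context, the Decision Transformer paradigm that EDT builds on treats control as return-conditioned sequence generation: feed (return-to-go, state, action) tokens and decode the next action. A minimal inference-loop sketch of that base paradigm, with a stand-in network and toy environment; it does not show EDT's specific modification:

    import numpy as np

    rng = np.random.default_rng(0)
    predict_action = lambda rtg, states, actions: rng.normal(size=2)  # stand-in net

    def dt_rollout(env_step, s0, target_return, horizon=10):
        """Return-conditioned decoding: keep a running return-to-go and history,
        and query the sequence model for the next action at each step."""
        rtg, states, actions = [target_return], [s0], []
        for _ in range(horizon):
            a = predict_action(rtg, states, actions)
            s, r = env_step(states[-1], a)
            actions.append(a); states.append(s)
            rtg.append(rtg[-1] - r)            # decrement the return target
        return actions

    toy_env = lambda s, a: (s + 0.1 * a, -float(np.sum(s ** 2)))
    dt_rollout(toy_env, np.zeros(2), target_return=100.0)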