Value gradient weighted model-based reinforcement learning

FM Luo, T Xu, H Lai, XH Chen, W Zhang… - Science China Information …, 2024 - Springer

Reinforcement learning (RL) interacts with the environment to solve sequential decision-making
problems via a trial-and-error approach. Errors are always undesirable in real-world …

Cited by 103 · Related articles · All 4 versions

Taskmet: Task-driven metric learning for model learning

D Bansal, RTQ Chen, M Mukadam… - Advances in Neural …, 2024 - proceedings.neurips.cc

Deep learning models are often used with some downstream task. Models solely trained to
achieve accurate predictions may struggle to perform well on the desired downstream tasks …

Cited by 10 · Related articles · All 6 versions

A unified framework for alternating offline model training and policy learning

S Yang, S Zhang, Y Feng… - Advances in Neural …, 2022 - proceedings.neurips.cc

In offline model-based reinforcement learning (offline MBRL), we learn a dynamic model
from historically collected data, and subsequently utilize the learned model and fixed …

Cited by 13 · Related articles · All 8 versions

Live in the moment: Learning dynamics model adapted to evolving policy

X Wang, W Wongkamjan, R Jia… - … on Machine Learning, 2023 - proceedings.mlr.press

Model-based reinforcement learning (RL) often achieves higher sample efficiency
in practice than model-free RL by learning a dynamics model to generate samples for policy …

Cited by 17 · Related articles · All 8 versions

Refining diffusion planner for reliable behavior synthesis by automatic detection of infeasible plans

K Lee, S Kim, J Choi - Advances in Neural Information …, 2024 - proceedings.neurips.cc

Diffusion-based planning has shown promising results in long-horizon, sparse-reward tasks
by training trajectory diffusion models and conditioning the sampled trajectories using …

Cited by 6 · Related articles · All 5 versions

Distributional model equivalence for risk-sensitive reinforcement learning

T Kastner, MA Erdogdu… - Advances in Neural …, 2023 - proceedings.neurips.cc

We consider the problem of learning models for risk-sensitive reinforcement learning. We
theoretically demonstrate that proper value equivalence, a method of learning models which …

Cited by 8 · Related articles · All 10 versions

Deciding what to model: Value-equivalent sampling for reinforcement learning

D Arumugam, B Van Roy - Advances in neural information …, 2022 - proceedings.neurips.cc

The quintessential model-based reinforcement-learning agent iteratively refines its
estimates or prior beliefs about the true underlying model of the environment. Recent …

Cited by 15 · Related articles · All 7 versions

Approximate value equivalence

C Grimm, A Barreto, S Singh - Advances in neural …, 2022 - proceedings.neurips.cc

Model-based reinforcement learning agents must make compromises about which
aspects of the environment their models should capture. The value equivalence (VE) …

Cited by 8 · Related articles · All 4 versions

Bayesian reinforcement learning with limited cognitive load

D Arumugam, MK Ho, ND Goodman, B Van Roy - Open Mind, 2024 - direct.mit.edu

All biological and artificial agents must act given limits on their ability to acquire and process
information. As such, a general theory of adaptive behavior should be able to account for the …

Cited by 10 · Related articles · All 7 versions

Understanding world models through multi-step pruning policy via reinforcement learning

Z He, W Qiu, W Zhao, X Shao, Z Liu - Information Sciences, 2025 - Elsevier

In model-based reinforcement learning, the conventional approach to addressing world
model bias is to use gradient optimization methods. However, using a singular policy from …

Cited by 1 · Related articles · All 2 versions
