A survey on model-based reinforcement learning

FM Luo, T Xu, H Lai, XH Chen, W Zhang… - Science China Information …, 2024 - Springer
Reinforcement learning (RL) interacts with the environment to solve sequential
decision-making problems via a trial-and-error approach. Errors are always undesirable in real-world …

TaskMet: Task-driven metric learning for model learning

D Bansal, RTQ Chen, M Mukadam… - Advances in Neural …, 2024 - proceedings.neurips.cc
Deep learning models are often used with some downstream task. Models solely trained to
achieve accurate predictions may struggle to perform well on the desired downstream tasks …

A unified framework for alternating offline model training and policy learning

S Yang, S Zhang, Y Feng… - Advances in Neural …, 2022 - proceedings.neurips.cc
In offline model-based reinforcement learning (offline MBRL), we learn a dynamic model
from historically collected data, and subsequently utilize the learned model and fixed …

Live in the moment: Learning dynamics model adapted to evolving policy

X Wang, W Wongkamjan, R Jia… - … on Machine Learning, 2023 - proceedings.mlr.press
Model-based reinforcement learning (RL) often achieves higher sample efficiency
in practice than model-free RL by learning a dynamics model to generate samples for policy …

Refining diffusion planner for reliable behavior synthesis by automatic detection of infeasible plans

K Lee, S Kim, J Choi - Advances in Neural Information …, 2024 - proceedings.neurips.cc
Diffusion-based planning has shown promising results in long-horizon, sparse-reward tasks
by training trajectory diffusion models and conditioning the sampled trajectories using …

Distributional model equivalence for risk-sensitive reinforcement learning

T Kastner, MA Erdogdu… - Advances in Neural …, 2023 - proceedings.neurips.cc
We consider the problem of learning models for risk-sensitive reinforcement learning. We
theoretically demonstrate that proper value equivalence, a method of learning models which …

Deciding what to model: Value-equivalent sampling for reinforcement learning

D Arumugam, B Van Roy - Advances in neural information …, 2022 - proceedings.neurips.cc
The quintessential model-based reinforcement-learning agent iteratively refines its
estimates or prior beliefs about the true underlying model of the environment. Recent …

Approximate value equivalence

C Grimm, A Barreto, S Singh - Advances in neural …, 2022 - proceedings.neurips.cc
Model-based reinforcement learning agents must make compromises about which
aspects of the environment their models should capture. The value equivalence (VE) …

Bayesian reinforcement learning with limited cognitive load

D Arumugam, MK Ho, ND Goodman, B Van Roy - Open Mind, 2024 - direct.mit.edu
All biological and artificial agents must act given limits on their ability to acquire and process
information. As such, a general theory of adaptive behavior should be able to account for the …

Understanding world models through multi-step pruning policy via reinforcement learning

Z He, W Qiu, W Zhao, X Shao, Z Liu - Information Sciences, 2025 - Elsevier
In model-based reinforcement learning, the conventional approach to addressing world
model bias is to use gradient optimization methods. However, using a singular policy from …