A policy gradient algorithm integrating long and short-term rewards for soft continuum arm control

X Dong, J Zhang, L Cheng, WJ Xu, H Su… - Science China …, 2022 - Springer
The soft continuum arm has extensive application in industrial production and human life
due to its superior safety and flexibility. Reinforcement learning is a powerful technique for …

Mingling foresight with imagination: Model-based cooperative multi-agent reinforcement learning

Z Xu, B Zhang, Y Zhan, Y Bai… - Advances in Neural …, 2022 - proceedings.neurips.cc
Recently, model-based agents have achieved better performance than model-free ones
using the same computational budget and training time in single-agent environments …

Diminishing Return of Value Expansion Methods in Model-Based Reinforcement Learning

D Palenicek, M Lutter, J Carvalho, J Peters - arXiv preprint arXiv …, 2023 - arxiv.org
Model-based reinforcement learning is one approach to increase sample efficiency.
However, the accuracy of the dynamics model and the resulting compounding error over …

Diminishing Return of Value Expansion Methods

D Palenicek, M Lutter, J Carvalho, D Dennert… - arXiv preprint arXiv …, 2024 - arxiv.org
Model-based reinforcement learning aims to increase sample efficiency, but the accuracy of
dynamics models and the resulting compounding errors are often seen as key limitations …

Revisiting Model-based Value Expansion

D Palenicek, M Lutter, J Peters - arXiv preprint arXiv:2203.14660, 2022 - arxiv.org
Model-based value expansion methods promise to improve the quality of value function
targets and, thereby, the effectiveness of value function learning. However, to date, these …