Reinforcement Learning: An Overview

K Murphy - arXiv preprint arXiv:2412.05265, 2024 - arxiv.org
This manuscript gives a big-picture, up-to-date overview of the field of (deep) reinforcement
learning and sequential decision making, covering value-based RL, policy-gradient …

Reinforcement Learning with Foundation Priors: Let Embodied Agent Efficiently Learn on Its Own

W Ye, Y Zhang, H Weng, X Gu, S Wang… - … Conference on Robot …, 2024 - openreview.net
Reinforcement learning (RL) is a promising approach for solving robotic manipulation tasks.
However, it is challenging to apply the RL algorithms directly in the real world. For one thing …

Foundation reinforcement learning: towards embodied generalist agents with foundation prior assistance

W Ye, Y Zhang, M Wang, S Wang, X Gu, P Abbeel… - 2023 - openreview.net
Recently, people have shown that large-scale pre-training from diverse internet-scale data is
the key to building a generalist model, as witnessed in the natural language processing …

Towards General-Purpose Model-Free Reinforcement Learning

S Fujimoto, P D'Oro, A Zhang, Y Tian… - arXiv preprint arXiv …, 2025 - arxiv.org
Reinforcement learning (RL) promises a framework for near-universal problem-solving. In
practice however, RL algorithms are often tailored to specific benchmarks, relying on …

Bigger, Regularized, Optimistic: scaling for compute and sample-efficient continuous control

M Nauman, M Ostaszewski, K Jankowski… - arXiv preprint arXiv …, 2024 - arxiv.org
Sample efficiency in Reinforcement Learning (RL) has traditionally been driven by
algorithmic enhancements. In this work, we demonstrate that scaling can also lead to …

Scaling Offline Model-Based RL via Jointly-Optimized World-Action Model Pretraining

J Cheng, R Qiao, G Xiong, Q Miao, Y Ma, B Li… - arXiv preprint arXiv …, 2024 - arxiv.org
A significant aspiration of offline reinforcement learning (RL) is to develop a generalist agent
with high capabilities from large and heterogeneous datasets. However, prior approaches …

Parallelizing Model-based Reinforcement Learning Over the Sequence Length

ZR Wang, D Yue, J Long, Y Zhang - The Thirty-eighth Annual Conference … - openreview.net
Recently, Model-based Reinforcement Learning (MBRL) methods have demonstrated
stunning sample efficiency in various RL domains. However, achieving this extraordinary …

General Tree Evaluation for AlphaZero

A Jaldevik - 2024 - repository.tudelft.nl
Over the last decade, there have been significant advances in model-based deep
reinforcement learning. One of the most successful such algorithms is AlphaZero which …