EfficientZero V2: Mastering Discrete and Continuous Control with Limited Data

Reinforcement Learning: An Overview

K Murphy - arXiv preprint arXiv:2412.05265, 2024 - arxiv.org

This manuscript gives a big-picture, up-to-date overview of the field of (deep) reinforcement
learning and sequential decision making, covering value-based RL, policy-gradient …

Reinforcement Learning with Foundation Priors: Let Embodied Agent Efficiently Learn on Its Own

W Ye, Y Zhang, H Weng, X Gu, S Wang… - … Conference on Robot …, 2024 - openreview.net

Reinforcement learning (RL) is a promising approach for solving robotic manipulation tasks.
However, it is challenging to apply the RL algorithms directly in the real world. For one thing …

Cited by: 1

Foundation reinforcement learning: towards embodied generalist agents with foundation prior assistance

W Ye, Y Zhang, M Wang, S Wang, X Gu, P Abbeel… - 2023 - openreview.net

Recently, people have shown that large-scale pre-training from diverse internet-scale data is
the key to building a generalist model, as witnessed in the natural language processing …

Cited by: 10

Towards General-Purpose Model-Free Reinforcement Learning

S Fujimoto, P D'Oro, A Zhang, Y Tian… - arXiv preprint arXiv …, 2025 - arxiv.org

Reinforcement learning (RL) promises a framework for near-universal problem-solving. In
practice however, RL algorithms are often tailored to specific benchmarks, relying on …

Bigger, Regularized, Optimistic: scaling for compute and sample-efficient continuous control

M Nauman, M Ostaszewski, K Jankowski… - arXiv preprint arXiv …, 2024 - arxiv.org

Sample efficiency in Reinforcement Learning (RL) has traditionally been driven by
algorithmic enhancements. In this work, we demonstrate that scaling can also lead to …

Cited by: 6

Scaling Offline Model-Based RL via Jointly-Optimized World-Action Model Pretraining

J Cheng, R Qiao, G Xiong, Q Miao, Y Ma, B Li… - arXiv preprint arXiv …, 2024 - arxiv.org

A significant aspiration of offline reinforcement learning (RL) is to develop a generalist agent
with high capabilities from large and heterogeneous datasets. However, prior approaches …

Parallelizing Model-based Reinforcement Learning Over the Sequence Length

ZR Wang, D Yue, J Long, Y Zhang - The Thirty-eighth Annual Conference … - openreview.net

Recently, Model-based Reinforcement Learning (MBRL) methods have demonstrated
stunning sample efficiency in various RL domains. However, achieving this extraordinary …

General Tree Evaluation for AlphaZero

A Jaldevik - 2024 - repository.tudelft.nl

Over the last decade, there have been significant advances in model-based deep
reinforcement learning. One of the most successful such algorithms is AlphaZero which …
