Multi-path policy optimization


4 results (0.02 seconds)

A survey on evolutionary reinforcement learning algorithms

Q Zhu, X Wu, Q Lin, L Ma, J Li, Z Ming, J Chen - Neurocomputing, 2023 - Elsevier

Reinforcement Learning (RL) has proven to be highly effective in various real-world
applications. However, in certain scenarios, Evolutionary Algorithms (EAs) have been …

Cited by 28 - Related articles - All 2 versions

[PDF] arxiv.org

Reinforcement learning with dynamic Boltzmann softmax updates

L Pan, Q Cai, Q Meng, W Chen, L Huang… - arXiv preprint arXiv …, 2019 - arxiv.org

Value function estimation is an important task in reinforcement learning, i.e., prediction. The
Boltzmann softmax operator is a natural value estimator and can provide several benefits …

Cited by 49 - Related articles - All 7 versions

UAV path planning based on the improved PPO algorithm

C Qi, C Wu, L Lei, X Li, P Cong - 2022 Asia Conference on …, 2022 - ieeexplore.ieee.org

In this paper, we consider the problem of unmanned aerial vehicle (UAV) path planning. The
traditional path planning algorithm has the problems of low efficiency and poor adaptability …

Cited by 13 - Related articles - All 2 versions

Exploration in policy optimization through multiple paths

L Pan, Q Cai, L Huang - Autonomous Agents and Multi-Agent Systems, 2021 - Springer

Recent years have witnessed a tremendous improvement of deep reinforcement learning.
However, a challenging problem is that an agent may suffer from inefficient exploration …

Cited by 2 - Related articles - All 2 versions
