Research progress in deep reinforcement learning and its application to path planning.

张荣霞, 武长旭, 孙同超… - Journal of Computer …, 2021 - search.ebscohost.com
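The goal of path planning is to let a robot avoid obstacles while moving and also quickly plan the shortest path.
Building on an analysis of the strengths and weaknesses of reinforcement-learning-based path planning algorithms, the paper goes on to introduce, for complex dynamic environments, good path …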

On the effect of auxiliary tasks on representation dynamics

C Lyle, M Rowland, G Ostrovski… - International …, 2021 - proceedings.mlr.press
While auxiliary tasks play a key role in shaping the representations learnt by reinforcement
learning agents, much is still unknown about the mechanisms through which this is …

[BOOK][B] Distributional reinforcement learning

MG Bellemare, W Dabney, M Rowland - 2023 - books.google.com
The first comprehensive guide to distributional reinforcement learning, providing a new
mathematical formalism for thinking about decisions from a probabilistic perspective …

Stock market prediction using deep reinforcement learning

AL Awad, SM Elkaffas, MW Fakhr - Applied System Innovation, 2023 - mdpi.com
Stock value prediction and trading, a captivating and complex research domain, continues to
draw heightened attention. Ensuring profitable returns in stock market investments demands …

Multimodal deep reinforcement learning with auxiliary task for obstacle avoidance of indoor mobile robot

H Song, A Li, T Wang, M Wang - Sensors, 2021 - mdpi.com
It is an essential capability of indoor mobile robots to avoid various kinds of obstacles.
Recently, multimodal deep reinforcement learning (DRL) methods have demonstrated great …

Explainability via causal self-talk

NA Roy, J Kim, N Rabinowitz - Advances in Neural …, 2022 - proceedings.neurips.cc
Explaining the behavior of AI systems is an important problem that, in practice, is generally
avoided. While the XAI community has been developing an abundance of techniques, most …

A concise review of intelligent game agent

H Li, X Pang, B Sun, K Liu - Entertainment Computing, 2024 - Elsevier
Intelligent game agents are crafted using AI technologies to mimic player behavior and
make decisions autonomously. Over the past decades, the scope of intelligent agents has …

Learning-based DoS attack power allocation in multiprocess systems

M Huang, K Ding, S Dey, Y Li… - IEEE Transactions on …, 2022 - ieeexplore.ieee.org
We study the denial-of-service (DoS) attack power allocation optimization in a multiprocess
cyber–physical system (CPS), where sensors observe different dynamic processes and …

Learning value functions in deep policy gradients using residual variance

Y Flet-Berliac, R Ouhamma, OA Maillard… - arXiv preprint arXiv …, 2020 - arxiv.org
Policy gradient algorithms have proven to be successful in diverse decision making and
control tasks. However, these methods suffer from high sample complexity and instability …

Learning semantic-agnostic and spatial-aware representation for generalizable visual-audio navigation

H Wang, Y Wang, F Zhong, M Wu… - IEEE Robotics and …, 2023 - ieeexplore.ieee.org
Visual-audio navigation (VAN) is attracting more and more attention from the robotic
community due to its broad applications, e.g., household robots and rescue robots. In this …