Survey of deep reinforcement learning based on value function and policy gradient

L Wang, Z Pan, J Wang - Complex System Modeling and …, 2021 - ieeexplore.ieee.org

As the critical component of manufacturing systems, production scheduling aims to optimize
objectives in terms of profit, efficiency, and energy consumption by reasonably determining …

Cited by 150 - Related articles - All 3 versions

Reinforcement learning in sustainable energy and electric systems: A survey

T Yang, L Zhao, W Li, AY Zomaya - Annual Reviews in Control, 2020 - Elsevier

The dynamic nature of sustainable energy and electric systems can vary significantly along
with the environment and load change, and they represent the features of multivariate, high …

Cited by 198 - Related articles

[PDF] mlr.press

Deterministic policy gradient algorithms

D Silver, G Lever, N Heess, T Degris… - International …, 2014 - proceedings.mlr.press

In this paper we consider deterministic policy gradient algorithms for reinforcement learning
with continuous actions. The deterministic policy gradient has a particularly appealing form …

Cited by 5125 - Related articles - All 32 versions

[PDF] ox.ac.uk

In-network machine learning using programmable network devices: A survey

C Zheng, X Hong, D Ding, S Vargaftik… - … Surveys & Tutorials, 2023 - ieeexplore.ieee.org

Machine learning is widely used to solve networking challenges, ranging from traffic
classification and anomaly detection to network configuration. However, machine learning …

Cited by 15 - Related articles - All 4 versions

[PDF] infocomm-journal.com

A survey of the current state of research on deep reinforcement learning algorithms and applications

刘朝阳, 穆朝絮, 孙长银 - Chinese Journal of Intelligent Science and Technology, 2020 - infocomm-journal.com

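Deep reinforcement learning is mainly used to handle perception-decision problems and has become an important research branch of artificial intelligence.
The paper outlines two classes of deep reinforcement learning algorithms, those based on value functions and those based on policy gradients, and describes in detail the deep Q-network …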

Cited by 17 - Related articles - All 3 versions

[PDF] cnjournals.com

[PDF] Visual navigation for mobile robots combining LSTM and PPO algorithms

张仪, 冯伟, 王卫军, 杨之乐, 张艳辉… - Electronic Measurement and Instrumentation …, 2023 - jemi.cnjournals.com

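To improve the visual navigation ability of mobile robots operating without a map and to raise the navigation success rate, the paper proposes a method that combines
long short-term memory (LSTM) neural networks with the proximal policy optimization algorithm (proximal policy …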

Cited by 7 - Related articles - All 4 versions

[PDF] springer.com

DM-DQN: Dueling Munchausen deep Q network for robot path planning

Y Gu, Z Zhu, J Lv, L Shi, Z Hou, S Xu - Complex & Intelligent Systems, 2023 - Springer

In order to achieve collision-free path planning in complex environment, Munchausen deep
Q-learning network (M-DQN) is applied to mobile robot to learn the best decision. On the …

Cited by 16 - Related articles - All 4 versions

Deep reinforcement learning-based radar network target assignment

F Meng, K Tian, C Wu - IEEE sensors journal, 2021 - ieeexplore.ieee.org

This study focuses on the problem of target assignment when a phased-array radar network
detects hypersonic-glide vehicles in near-space and proposes a method for target …

Cited by 31 - Related articles - All 2 versions

Deep reinforcement learning and research progress on its use in path planning.

张荣霞, 武长旭, 孙同超… - Journal of Computer …, 2021 - search.ebscohost.com

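The goal of path planning is to let a robot avoid obstacles while moving and also quickly plan the shortest path.
After analyzing the strengths and weaknesses of reinforcement-learning-based path planning algorithms, the paper introduces an approach that, in complex dynamic environments, can achieve good path …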

Cited by 8 - Related articles - All 2 versions

A deep deterministic policy gradient algorithm based on averaged state-action estimation

J Xu, H Zhang, J Qiu - Computers and Electrical Engineering, 2022 - Elsevier

Deep Reinforcement Learning (DRL), one of the most popular research topics in
artificial intelligence, has achieved a breakthrough in continuous control tasks. Nonetheless …

Cited by 11 - Related articles - All 2 versions
