A review of reinforcement learning based intelligent optimization for manufacturing scheduling

L Wang, Z Pan, J Wang - Complex System Modeling and …, 2021 - ieeexplore.ieee.org
As the critical component of manufacturing systems, production scheduling aims to optimize
objectives in terms of profit, efficiency, and energy consumption by reasonably determining …

Reinforcement learning in sustainable energy and electric systems: A survey

T Yang, L Zhao, W Li, AY Zomaya - Annual Reviews in Control, 2020 - Elsevier
The dynamic nature of sustainable energy and electric systems can vary significantly along
with the environment and load change, and they represent the features of multivariate, high …

Deterministic policy gradient algorithms

D Silver, G Lever, N Heess, T Degris… - International …, 2014 - proceedings.mlr.press
In this paper we consider deterministic policy gradient algorithms for reinforcement learning
with continuous actions. The deterministic policy gradient has a particularly appealing form …

In-network machine learning using programmable network devices: A survey

C Zheng, X Hong, D Ding, S Vargaftik… - … Surveys & Tutorials, 2023 - ieeexplore.ieee.org
Machine learning is widely used to solve networking challenges, ranging from traffic
classification and anomaly detection to network configuration. However, machine learning …

A survey of the current state of research on deep reinforcement learning algorithms and applications

刘朝阳, 穆朝絮, 孙长银 - 智能科学与技术学报, 2020 - infocomm-journal.com
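Deep reinforcement learning is mainly used to handle perception-decision problems and has become an important research branch in the field of artificial intelligence.
Two classes of deep reinforcement learning algorithms, based on value functions and on policy gradients, are summarized, and the deep Q-network is described in detail …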

Visual navigation for mobile robots integrating LSTM and PPO algorithms

张仪, 冯伟, 王卫军, 杨之乐, 张艳辉… - 电子测量与仪器 …, 2023 - jemi.cnjournals.com
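To improve the visual navigation capability of mobile robots in mapless settings and raise the navigation success rate, the paper proposes a fusion of the long short-term
memory neural network (long short term memory, LSTM) and the proximal policy optimization algorithm (proximal policy …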

DM-DQN: Dueling Munchausen deep Q network for robot path planning

Y Gu, Z Zhu, J Lv, L Shi, Z Hou, S Xu - Complex & Intelligent Systems, 2023 - Springer
In order to achieve collision-free path planning in complex environment, Munchausen deep
Q-learning network (M-DQN) is applied to mobile robot to learn the best decision. On the …

Deep reinforcement learning-based radar network target assignment

F Meng, K Tian, C Wu - IEEE sensors journal, 2021 - ieeexplore.ieee.org
This study focuses on the problem of target assignment when a phased-array radar network
detects hypersonic-glide vehicles in near-space and proposes a method for target …

Deep reinforcement learning and research progress in its application to path planning

张荣霞, 武长旭, 孙同超… - Journal of Computer …, 2021 - search.ebscohost.com
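The goal of path planning is to let a robot both avoid obstacles while moving and quickly plan the shortest path.
Building on an analysis of the advantages and disadvantages of reinforcement learning based path planning algorithms, the paper introduces methods that can perform, in complex dynamic environments, good path …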

A deep deterministic policy gradient algorithm based on averaged state-action estimation

J Xu, H Zhang, J Qiu - Computers and Electrical Engineering, 2022 - Elsevier
Deep Reinforcement Learning (DRL), one of the most popular research topics in
artificial intelligence, has achieved a breakthrough in continuous control tasks. Nonetheless …