Transfer learning in deep reinforcement learning: A survey

Z Zhu, K Lin, AK Jain, J Zhou - IEEE Transactions on Pattern …, 2023 - ieeexplore.ieee.org
Reinforcement learning is a learning paradigm for solving sequential decision-making
problems. Recent years have witnessed remarkable progress in reinforcement learning …

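A survey of collective intelligence decision-making methods jointly driven by knowledge and data

蒲志强, 易建强, 刘振, 丘腾海, 孙金林, 李非墨 - Acta Automatica Sinica (自动化学报), 2022 - aas.net.cn
Collective intelligence (CI) systems have broad application prospects. Current collective intelligence decision-making methods fall mainly
into two categories, knowledge-driven and data-driven, each with its own strengths and weaknesses. This paper argues that jointly knowledge- and data-driven approaches will give collective …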

Learning to utilize shaping rewards: A new approach of reward shaping

Y Hu, W Wang, H Jia, Y Wang… - Advances in …, 2020 - proceedings.neurips.cc
Reward shaping is an effective technique for incorporating domain knowledge into
reinforcement learning (RL). Existing approaches such as potential-based reward shaping …

Autonomous navigation of UAVs in large-scale complex environments: A deep reinforcement learning approach

C Wang, J Wang, Y Shen… - IEEE Transactions on …, 2019 - ieeexplore.ieee.org
In this paper, we propose a deep reinforcement learning (DRL)-based method that allows
unmanned aerial vehicles (UAVs) to execute navigation tasks in large-scale complex …

RUDDER: Return decomposition for delayed rewards

JA Arjona-Medina, M Gillhofer… - Advances in …, 2019 - proceedings.neurips.cc
We propose RUDDER, a novel reinforcement learning approach for delayed rewards in
finite Markov decision processes (MDPs). In MDPs the Q-values are equal to the expected …

Deep reinforcement learning and reward shaping based eco-driving control for automated HEVs among signalized intersections

J Li, X Wu, M Xu, Y Liu - Energy, 2022 - Elsevier
In a connected traffic environment with signalized intersections, eco-driving control needs to
co-optimize fuel economy (fuel consumption), driving safety (collisions and red lights), and …

Human-centered reinforcement learning: A survey

G Li, R Gomez, K Nakamura… - IEEE Transactions on …, 2019 - ieeexplore.ieee.org
Human-centered reinforcement learning (RL), in which an agent learns how to perform a
task from evaluative feedback delivered by a human observer, has become more and more …

State abstractions for lifelong reinforcement learning

D Abel, D Arumugam, L Lehnert… - … on Machine Learning, 2018 - proceedings.mlr.press
In lifelong reinforcement learning, agents must effectively transfer knowledge across tasks
while simultaneously addressing exploration, credit assignment, and generalization. State …

Reward shaping to improve the performance of deep reinforcement learning in perishable inventory management

BJ De Moor, J Gijsbrechts, RN Boute - European Journal of Operational …, 2022 - Elsevier
Deep reinforcement learning (DRL) has proven to be an effective, general-purpose
technology to develop 'good' replenishment policies in inventory management. We show …

What can learned intrinsic rewards capture?

Z Zheng, J Oh, M Hessel, Z Xu… - International …, 2020 - proceedings.mlr.press
The objective of a reinforcement learning agent is to behave so as to maximise the sum of a
suitable scalar function of state: the reward. These rewards are typically given and …