A survey of meta-reinforcement learning

J Beck, R Vuorio, EZ Liu, Z Xiong, L Zintgraf… - arXiv preprint arXiv …, 2023 - arxiv.org
While deep reinforcement learning (RL) has fueled multiple high-profile successes in
machine learning, it is held back from more widespread adoption by its often poor data …

Meta-reinforcement learning based on self-supervised task representation learning

M Wang, Z Bing, X Yao, S Wang, H Kai, H Su… - Proceedings of the …, 2023 - ojs.aaai.org
Meta-reinforcement learning enables artificial agents to learn from related training tasks and
adapt to new tasks efficiently with minimal interaction data. However, most existing research …

Analyzing Generalization in Policy Networks: A Case Study with the Double-Integrator System

R Zhang, H Han, M Lv, Q Yang, J Cheng - Proceedings of the AAAI …, 2024 - ojs.aaai.org
Extensive utilization of deep reinforcement learning (DRL) policy networks in diverse
continuous control tasks has raised questions regarding performance degradation in …

A Survey of Reinforcement Learning for Optimization in Automation

A Farooq, K Iqbal - 2024 IEEE 20th International Conference on …, 2024 - ieeexplore.ieee.org
Reinforcement Learning (RL) has become a critical tool for optimization challenges within
automation, leading to significant advancements in several areas. This review article …

Natural Policy Gradient and Actor Critic Methods for Constrained Multi-Task Reinforcement Learning

S Zeng, TT Doan, J Romberg - arXiv preprint arXiv:2405.02456, 2024 - arxiv.org
Multi-task reinforcement learning (RL) aims to find a single policy that effectively solves
multiple tasks at the same time. This paper presents a constrained formulation for multi-task …

Generalization Analysis of Policy Networks: An Example of Double-Integrator

R Zhang, H Han, M Lv, Q Yang, J Cheng - arXiv preprint arXiv:2312.10472, 2023 - arxiv.org
Extensive utilization of deep reinforcement learning (DRL) policy networks in diverse
continuous control tasks has raised questions regarding performance degradation in …

Message Action Adapter Framework in Multi-Agent Reinforcement Learning

B Park, J Choi - Applied Sciences (2076-3417), 2024 - search.ebscohost.com
Multi-agent reinforcement learning (MARL) has demonstrated significant potential in
enabling cooperative agents. The communication protocol, which is responsible for …

A Dynamic and Task-Independent Reward Shaping Approach for Discrete Partially Observable Markov Decision Processes

S Nahali, H Ayadi, JX Huang, E Pakizeh… - Pacific-Asia Conference …, 2023 - Springer
Agents often need a long time to explore state-action space in order to learn how to act
expectedly in Partially Observable Markov Decision Processes (POMDPs). With the reward …

Sample-Efficient Algorithms for Hard-Exploration Problems in Reinforcement Learning

Y Guo - 2022 - deepblue.lib.umich.edu
Reinforcement learning (RL) aims to learn optimal behaviors for agents to maximize
cumulative rewards through trial-and-error interactions with dynamic environments. In recent …

[CITATION][C] A Survey of Meta-Reinforcement Learning Research

陈奕宇, 霍静, 丁天雨, 高阳 - Journal of Software, 2023