A survey of progress on cooperative multi-agent reinforcement learning in open environment

L Yuan, Z Zhang, L Li, C Guan, Y Yu - arXiv preprint arXiv:2312.01058, 2023 - arxiv.org
Multi-agent Reinforcement Learning (MARL) has gained wide attention in recent years and
has made progress in various fields. Specifically, cooperative MARL focuses on training a …

Graph decision transformer

S Hu, L Shen, Y Zhang, D Tao - arXiv preprint arXiv:2303.03747, 2023 - arxiv.org
Offline reinforcement learning (RL) is a challenging task, whose objective is to learn policies
from static trajectory data without interacting with the environment. Recently, offline RL has …

Prompt-tuning decision transformer with preference ranking

S Hu, L Shen, Y Zhang, D Tao - arXiv preprint arXiv:2305.09648, 2023 - arxiv.org
Prompt-tuning has emerged as a promising method for adapting pre-trained models to
downstream tasks or aligning with human preferences. Prompt learning is widely used in …

SaFormer: A conditional sequence modeling approach to offline safe reinforcement learning

Q Zhang, L Zhang, H Xu, L Shen, B Wang… - arXiv preprint arXiv …, 2023 - arxiv.org
Offline safe RL is of great practical relevance for deploying agents in real-world applications.
However, acquiring constraint-satisfying policies from the fixed dataset is non-trivial for …

PDiT: Interleaving perception and decision-making transformers for deep reinforcement learning

H Mao, R Zhao, Z Li, Z Xu, H Chen, Y Chen… - arXiv preprint arXiv …, 2023 - arxiv.org
Designing better deep networks and better reinforcement learning (RL) algorithms are both
important for deep RL. This work studies the former. Specifically, the Perception and …

Instructed diffuser with temporal condition guidance for offline reinforcement learning

J Hu, Y Sun, S Huang, SY Guo, H Chen, L Shen… - arXiv preprint arXiv …, 2023 - arxiv.org
Recent works have shown the potential of diffusion models in computer vision and natural
language processing. Apart from the classical supervised learning fields, diffusion models …

Transformer in transformer as backbone for deep reinforcement learning

H Mao, R Zhao, H Chen, J Hao, Y Chen, D Li… - arXiv preprint arXiv …, 2022 - arxiv.org
Designing better deep networks and better reinforcement learning (RL) algorithms are both
important for deep RL. This work focuses on the former. Previous methods build the network …

Q-value regularized transformer for offline reinforcement learning

S Hu, Z Fan, C Huang, L Shen, Y Zhang… - arXiv preprint arXiv …, 2024 - arxiv.org
Recent advancements in offline reinforcement learning (RL) have underscored the
capabilities of Conditional Sequence Modeling (CSM), a paradigm that learns the action …

HarmoDT: Harmony Multi-Task Decision Transformer for Offline Reinforcement Learning

S Hu, Z Fan, L Shen, Y Zhang, Y Wang… - arXiv preprint arXiv …, 2024 - arxiv.org
The purpose of offline multi-task reinforcement learning (MTRL) is to develop a unified policy
applicable to diverse tasks without the need for online environmental interaction. Recent …

In-context reinforcement learning for variable action spaces

V Sinii, A Nikulin, V Kurenkov, I Zisman… - arXiv preprint arXiv …, 2023 - arxiv.org
Recent work has shown that supervised pre-training on learning histories of RL algorithms
results in a model that captures the learning process and is able to improve in-context on …