On Transforming Reinforcement Learning With Transformers: The Development Trajectory

L Yuan, Z Zhang, L Li, C Guan, Y Yu - arXiv preprint arXiv:2312.01058, 2023 - arxiv.org

Multi-agent Reinforcement Learning (MARL) has gained wide attention in recent years and
has made progress in various fields. Specifically, cooperative MARL focuses on training a …

Cited by 5 · Related articles · All 2 versions

Graph decision transformer

S Hu, L Shen, Y Zhang, D Tao - arXiv preprint arXiv:2303.03747, 2023 - arxiv.org

Offline reinforcement learning (RL) is a challenging task, whose objective is to learn policies
from static trajectory data without interacting with the environment. Recently, offline RL has …

Cited by 14 · Related articles · All 3 versions

Prompt-tuning decision transformer with preference ranking

S Hu, L Shen, Y Zhang, D Tao - arXiv preprint arXiv:2305.09648, 2023 - arxiv.org

Prompt-tuning has emerged as a promising method for adapting pre-trained models to
downstream tasks or aligning with human preferences. Prompt learning is widely used in …

Cited by 8 · Related articles · All 3 versions

Saformer: A conditional sequence modeling approach to offline safe reinforcement learning

Q Zhang, L Zhang, H Xu, L Shen, B Wang… - arXiv preprint arXiv …, 2023 - arxiv.org

Offline safe RL is of great practical relevance for deploying agents in real-world applications.
However, acquiring constraint-satisfying policies from the fixed dataset is non-trivial for …

Cited by 12 · Related articles · All 3 versions

Pdit: Interleaving perception and decision-making transformers for deep reinforcement learning

H Mao, R Zhao, Z Li, Z Xu, H Chen, Y Chen… - arXiv preprint arXiv …, 2023 - arxiv.org

Designing better deep networks and better reinforcement learning (RL) algorithms are both
important for deep RL. This work studies the former. Specifically, the Perception and …

Cited by 3 · Related articles · All 6 versions

Instructed diffuser with temporal condition guidance for offline reinforcement learning

J Hu, Y Sun, S Huang, SY Guo, H Chen, L Shen… - arXiv preprint arXiv …, 2023 - arxiv.org

Recent works have shown the potential of diffusion models in computer vision and natural
language processing. Apart from the classical supervised learning fields, diffusion models …

Cited by 6 · Related articles · All 3 versions

Transformer in transformer as backbone for deep reinforcement learning

H Mao, R Zhao, H Chen, J Hao, Y Chen, D Li… - arXiv preprint arXiv …, 2022 - arxiv.org

Designing better deep networks and better reinforcement learning (RL) algorithms are both
important for deep RL. This work focuses on the former. Previous methods build the network …

Cited by 8 · Related articles · All 2 versions

Q-value regularized transformer for offline reinforcement learning

S Hu, Z Fan, C Huang, L Shen, Y Zhang… - arXiv preprint arXiv …, 2024 - arxiv.org

Recent advancements in offline reinforcement learning (RL) have underscored the
capabilities of Conditional Sequence Modeling (CSM), a paradigm that learns the action …

Cited by 1 · Related articles · All 3 versions

HarmoDT: Harmony Multi-Task Decision Transformer for Offline Reinforcement Learning

S Hu, Z Fan, L Shen, Y Zhang, Y Wang… - arXiv preprint arXiv …, 2024 - arxiv.org

The purpose of offline multi-task reinforcement learning (MTRL) is to develop a unified policy
applicable to diverse tasks without the need for online environmental interaction. Recent …

Cited by 1 · Related articles · All 3 versions

In-context reinforcement learning for variable action spaces

V Sinii, A Nikulin, V Kurenkov, I Zisman… - arXiv preprint arXiv …, 2023 - arxiv.org

Recent work has shown that supervised pre-training on learning histories of RL algorithms
results in a model that captures the learning process and is able to improve in-context on …

Cited by 2 · Related articles · All 3 versions
