Rethinking decision transformer via hierarchical reinforcement learning

Y Ma, J Hao, H Liang, C Xiao - Forty-first International …, 2023 - openreview.net
Decision Transformer (DT) is an innovative algorithm leveraging recent advances of the
transformer architecture in reinforcement learning (RL). However, a notable limitation of DT …

Pre-training goal-based models for sample-efficient reinforcement learning

H Yuan, Z Mu, F Xie, Z Lu - The Twelfth International Conference on …, 2024 - openreview.net
Pre-training on task-agnostic large datasets is a promising approach for enhancing the
sample efficiency of reinforcement learning (RL) in solving complex tasks. We present …

Self-supervised Pretraining for Decision Foundation Model: Formulation, Pipeline and Challenges

X Liu, J Jiao, J Zhang - arXiv preprint arXiv:2401.00031, 2023 - arxiv.org
Decision-making is a dynamic process requiring perception, memory, and reasoning to
make choices and find optimal policies. Traditional approaches to decision-making suffer …

Decision Mamba: A Multi-Grained State Space Model with Self-Evolution Regularization for Offline RL

Q Lv, X Deng, G Chen, MY Wang, L Nie - arXiv preprint arXiv:2406.05427, 2024 - arxiv.org
While the conditional sequence modeling with the transformer architecture has
demonstrated its effectiveness in dealing with offline reinforcement learning (RL) tasks, it is …

Understanding the Training and Generalization of Pretrained Transformer for Sequential Decision Making

H Wang, Y Pan, F Sun, S Liu, K Talluri, G Chen… - arXiv preprint arXiv …, 2024 - arxiv.org
In this paper, we consider the supervised pretrained transformer for a class of sequential
decision-making problems. The class of considered problems is a subset of the general …

Steering Decision Transformers via Temporal Difference Learning

HL Hsu, AK Bozkurt, J Dong, Q Gao, V Tarokh, M Pajic - cpsl.pratt.duke.edu
Decision Transformers (DTs) have been highly effective for offline reinforcement learning
(RL) tasks, successfully modeling the sequences of actions in a given set of demonstrations …

D2T2: Decision Transformer with Temporal Difference via Steering Guidance

HL Hsu, J Dong, Q Gao, AK Bozkurt, V Tarokh, M Pajic - openreview.net
Despite the promising performance of Decision Transformers (DT) on a wide range of tasks,
recent studies have found that the performance of DT may largely be dependent on the …

Project. Report-xxz&mzc

Z Mu, X Xiao - Peking University Course: Cognitive Reasoning - openreview.net
The Theory of Mind (ToM) ability in multi-agent systems is crucial for coordinating
cooperation and understanding communication. ToM involves the capacity to reason about …

Adaptformer: Sequence models as adaptive iterative planners

A Karthikeyan, YV Pant - aair-lab.github.io
Sequence models have emerged as an alternate paradigm for offline Reinforcement
Learning (RL) with their remarkable generative capabilities. However, it struggles in cases …