Waypoint transformer: Reinforcement learning via supervised learning with intermediate targets

Y Ma, HAO Jianye, H Liang, C Xiao - Forty-first International …, 2023 - openreview.net

Decision Transformer (DT) is an innovative algorithm leveraging recent advances of the
transformer architecture in reinforcement learning (RL). However, a notable limitation of DT …

Cited by 4 · Related articles · All 3 versions

[PDF] openreview.net

Pre-training goal-based models for sample-efficient reinforcement learning

H Yuan, Z Mu, F Xie, Z Lu - The Twelfth International Conference on …, 2024 - openreview.net

Pre-training on task-agnostic large datasets is a promising approach for enhancing the
sample efficiency of reinforcement learning (RL) in solving complex tasks. We present …

Cited by 2 · Related articles

[PDF] arxiv.org

Self-supervised Pretraining for Decision Foundation Model: Formulation, Pipeline and Challenges

X Liu, J Jiao, J Zhang - arXiv preprint arXiv:2401.00031, 2023 - arxiv.org

Decision-making is a dynamic process requiring perception, memory, and reasoning to
make choices and find optimal policies. Traditional approaches to decision-making suffer …

Cited by 1 · Related articles · All 2 versions

[PDF] arxiv.org

Decision Mamba: A Multi-Grained State Space Model with Self-Evolution Regularization for Offline RL

Q Lv, X Deng, G Chen, MY Wang, L Nie - arXiv preprint arXiv:2406.05427, 2024 - arxiv.org

While the conditional sequence modeling with the transformer architecture has
demonstrated its effectiveness in dealing with offline reinforcement learning (RL) tasks, it is …

Understanding the Training and Generalization of Pretrained Transformer for Sequential Decision Making

H Wang, Y Pan, F Sun, S Liu, K Talluri, G Chen… - arXiv preprint arXiv …, 2024 - arxiv.org

In this paper, we consider the supervised pretrained transformer for a class of sequential
decision-making problems. The class of considered problems is a subset of the general …

[PDF] Steering Decision Transformers via Temporal Difference Learning

HL Hsu, AK Bozkurt, J Dong, Q Gao, V Tarokh, M Pajic - cpsl.pratt.duke.edu

Decision Transformers (DTs) have been highly effective for offline reinforcement learning
(RL) tasks, successfully modeling the sequences of actions in a given set of demonstrations …

D2T2: Decision Transformer with Temporal Difference via Steering Guidance

HL Hsu, J Dong, Q Gao, AK Bozkurt, V Tarokh, M Pajic - openreview.net

Despite the promising performance of Decision Transformers (DT) on a wide range of tasks,
recent studies have found that the performance of DT may largely be dependent on the …

[PDF] openreview.net

Project. Report-xxz&mzc

Z Mu, X Xiao - Peking University Course: Cognitive Reasoning - openreview.net

The Theory of Mind (ToM) ability in multi-agent systems is crucial for coordinating
cooperation and understanding communication. ToM involves the capacity to reason about …

[PDF] github.io

[PDF] Adaptformer: Sequence models as adaptive iterative planners

A Karthikeyan, YV Pant - aair-lab.github.io

Sequence models have emerged as an alternate paradigm for offline Reinforcement
Learning (RL) with their remarkable generative capabilities. However, it struggles in cases …
