Related articles - Academic resource search

Waypoint transformer: Reinforcement learning via supervised learning with intermediate targets

A Badrinath, Y Flet-Berliac, A Nie… - Advances in Neural …, 2024 - proceedings.neurips.cc

Despite the recent advancements in offline reinforcement learning via supervised learning
(RvS) and the success of the decision transformer (DT) architecture in various domains, DTs …

Cited by 9 | Related articles | All 7 versions

[PDF] neurips.cc

PlayVirtual: Augmenting cycle-consistent virtual trajectories for reinforcement learning

T Yu, C Lan, W Zeng, M Feng… - Advances in Neural …, 2021 - proceedings.neurips.cc

Learning good feature representations is important for deep reinforcement learning (RL).
However, with limited experience, RL often suffers from data inefficiency for training. For un …

Cited by 26 | Related articles | All 9 versions

[PDF] openreview.net

Rethinking decision transformer via hierarchical reinforcement learning

Y Ma, HAO Jianye, H Liang, C Xiao - Forty-first International …, 2023 - openreview.net

Decision Transformer (DT) is an innovative algorithm leveraging recent advances of the
transformer architecture in reinforcement learning (RL). However, a notable limitation of DT …

Cited by 4 | Related articles | All 3 versions

[PDF] aaai.org

Critic-guided decision transformer for offline reinforcement learning

Y Wang, C Yang, Y Wen, Y Liu, Y Qiao - Proceedings of the AAAI …, 2024 - ojs.aaai.org

Recent advancements in offline reinforcement learning (RL) have underscored the
capabilities of Return-Conditioned Supervised Learning (RCSL), a paradigm that learns the …

Cited by 7 | Related articles | All 5 versions

[PDF] neurips.cc

Trajectory-wise multiple choice learning for dynamics generalization in reinforcement learning

Y Seo, K Lee, I Clavera Gilaberte… - Advances in …, 2020 - proceedings.neurips.cc

Model-based reinforcement learning (RL) has shown great potential in various
control tasks in terms of both sample-efficiency and final performance. However, learning a …

Cited by 35 | Related articles | All 8 versions

[PDF] arxiv.org

Q-value regularized transformer for offline reinforcement learning

S Hu, Z Fan, C Huang, L Shen, Y Zhang… - arXiv preprint arXiv …, 2024 - arxiv.org

Recent advancements in offline reinforcement learning (RL) have underscored the
capabilities of Conditional Sequence Modeling (CSM), a paradigm that learns the action …

Cited by 1 | Related articles | All 3 versions

[PDF] neurips.cc

For SALE: State-action representation learning for deep reinforcement learning

S Fujimoto, WD Chang, E Smith… - Advances in …, 2024 - proceedings.neurips.cc

In reinforcement learning (RL), representation learning is a proven tool for complex image-
based tasks, but is often overlooked for environments with low-level states, such as physical …

Cited by 21 | Related articles | All 5 versions

[PDF] mlr.press

A trajectory is worth three sentences: multimodal transformer for offline reinforcement learning

Y Wang, M Xu, L Shi, Y Chi - Uncertainty in Artificial …, 2023 - proceedings.mlr.press

Transformers hold tremendous promise in solving offline reinforcement learning (RL) by
formulating it as a sequence modeling problem inspired by language modeling (LM). Prior …

Cited by 4 | Related articles | All 6 versions

[PDF] arxiv.org

A short survey on memory based reinforcement learning

D Ramani - arXiv preprint arXiv:1904.06736, 2019 - arxiv.org

Reinforcement learning (RL) is a branch of machine learning which is employed to solve
various sequential decision making problems without proper supervision. Due to the recent …

Cited by 24 | Related articles | All 2 versions

[PDF] arxiv.org

Efficient deep reinforcement learning requires regulating overfitting

Q Li, A Kumar, I Kostrikov, S Levine - arXiv preprint arXiv:2304.10466, 2023 - arxiv.org

Deep reinforcement learning algorithms that learn policies by trial-and-error must learn from
limited amounts of data collected by actively interacting with the environment. While many …

Cited by 24 | Related articles | All 3 versions
