A survey on offline reinforcement learning: Taxonomy, review, and open problems

S Hu, L Shen, Y Zhang, Y Chen… - IEEE Transactions on …, 2024 - ieeexplore.ieee.org

Transformers, originally devised for natural language processing (NLP), have also produced
significant successes in computer vision (CV). Due to their strong expression power …

Cited by 15 · Related articles · All 2 versions

Autonomous driving system: A comprehensive survey

J Zhao, W Zhao, B Deng, Z Wang, F Zhang… - Expert Systems with …, 2023 - Elsevier

Automation is increasingly at the forefront of transportation research, with the potential to
bring fully autonomous vehicles to our roads in the coming years. This comprehensive …

Cited by 15 · Related articles · All 2 versions

Settling the sample complexity of model-based offline reinforcement learning

G Li, L Shi, Y Chen, Y Chi, Y Wei - The Annals of Statistics, 2024 - projecteuclid.org

The Annals of Statistics 2024, Vol. 52, No. 1, 233–260.
https://doi.org/10.1214/23-AOS2342 © …

Cited by 76 · Related articles · All 5 versions

A survey on safety-critical driving scenario generation—A methodological perspective

W Ding, C Xu, M Arief, H Lin, B Li… - IEEE Transactions on …, 2023 - ieeexplore.ieee.org

Autonomous driving systems have witnessed significant development during the past years
thanks to the advance in machine learning-enabled sensing and decision-making …

Cited by 91 · Related articles · All 7 versions

Q-learning decision transformer: Leveraging dynamic programming for conditional sequence modelling in offline RL

T Yamagata, A Khalil… - … on Machine Learning, 2023 - proceedings.mlr.press

Recent works have shown that tackling offline reinforcement learning (RL) with a conditional
policy produces promising results. The Decision Transformer (DT) combines the conditional …

Cited by 46 · Related articles · All 8 versions

Constrained decision transformer for offline safe reinforcement learning

Z Liu, Z Guo, Y Yao, Z Cen, W Yu… - International …, 2023 - proceedings.mlr.press

Safe reinforcement learning (RL) trains a constraint satisfaction policy by interacting with the
environment. We aim to tackle a more challenging problem: learning a safe policy from an …

Cited by 31 · Related articles · All 6 versions

Interactive natural language processing

Z Wang, G Zhang, K Yang, N Shi, W Zhou… - arXiv preprint arXiv …, 2023 - arxiv.org

Interactive Natural Language Processing (iNLP) has emerged as a novel paradigm within
the field of NLP, aimed at addressing limitations in existing frameworks while aligning with …

Cited by 37 · Related articles · All 4 versions

AdaptDiffuser: Diffusion models as adaptive self-evolving planners

Z Liang, Y Mu, M Ding, F Ni, M Tomizuka… - arXiv preprint arXiv …, 2023 - arxiv.org

Diffusion models have demonstrated their powerful generative capability in many tasks, with
great potential to serve as a paradigm for offline reinforcement learning. However, the …

Cited by 49 · Related articles · All 6 versions

Boosting offline reinforcement learning with action preference query

Q Yang, S Wang, MG Lin, S Song… - … on Machine Learning, 2023 - proceedings.mlr.press

Training practical agents usually involves offline and online reinforcement learning (RL) to
balance the policy's performance and interaction costs. In particular, online fine-tuning has …

Cited by 4 · Related articles · All 6 versions

Critic-guided decision transformer for offline reinforcement learning

Y Wang, C Yang, Y Wen, Y Liu, Y Qiao - Proceedings of the AAAI …, 2024 - ojs.aaai.org

Recent advancements in offline reinforcement learning (RL) have underscored the
capabilities of Return-Conditioned Supervised Learning (RCSL), a paradigm that learns the …

Cited by 7 · Related articles · All 2 versions
