On Transforming Reinforcement Learning With Transformers: The Development Trajectory

S Hu, L Shen, Y Zhang, Y Chen… - IEEE Transactions on …, 2024 - ieeexplore.ieee.org
Transformers, originally devised for natural language processing (NLP), have also produced
significant successes in computer vision (CV). Due to their strong expression power …

Autonomous driving system: A comprehensive survey

J Zhao, W Zhao, B Deng, Z Wang, F Zhang… - Expert Systems with …, 2023 - Elsevier
Automation is increasingly at the forefront of transportation research, with the potential to
bring fully autonomous vehicles to our roads in the coming years. This comprehensive …

Settling the sample complexity of model-based offline reinforcement learning

G Li, L Shi, Y Chen, Y Chi, Y Wei - The Annals of Statistics, 2024 - projecteuclid.org
The Annals of Statistics 2024, Vol. 52, No. 1, 233–260 https://doi.org/10.1214/23-AOS2342 © …

A survey on safety-critical driving scenario generation—A methodological perspective

W Ding, C Xu, M Arief, H Lin, B Li… - IEEE Transactions on …, 2023 - ieeexplore.ieee.org
Autonomous driving systems have witnessed significant development during the past years
thanks to the advance in machine learning-enabled sensing and decision-making …

Q-learning decision transformer: Leveraging dynamic programming for conditional sequence modelling in offline RL

T Yamagata, A Khalil… - … on Machine Learning, 2023 - proceedings.mlr.press
Recent works have shown that tackling offline reinforcement learning (RL) with a conditional
policy produces promising results. The Decision Transformer (DT) combines the conditional …

Constrained decision transformer for offline safe reinforcement learning

Z Liu, Z Guo, Y Yao, Z Cen, W Yu… - International …, 2023 - proceedings.mlr.press
Safe reinforcement learning (RL) trains a constraint satisfaction policy by interacting with the
environment. We aim to tackle a more challenging problem: learning a safe policy from an …

Interactive natural language processing

Z Wang, G Zhang, K Yang, N Shi, W Zhou… - arXiv preprint arXiv …, 2023 - arxiv.org
Interactive Natural Language Processing (iNLP) has emerged as a novel paradigm within
the field of NLP, aimed at addressing limitations in existing frameworks while aligning with …

AdaptDiffuser: Diffusion models as adaptive self-evolving planners

Z Liang, Y Mu, M Ding, F Ni, M Tomizuka… - arXiv preprint arXiv …, 2023 - arxiv.org
Diffusion models have demonstrated their powerful generative capability in many tasks, with
great potential to serve as a paradigm for offline reinforcement learning. However, the …

Boosting offline reinforcement learning with action preference query

Q Yang, S Wang, MG Lin, S Song… - … on Machine Learning, 2023 - proceedings.mlr.press
Training practical agents usually involves offline and online reinforcement learning (RL) to
balance the policy's performance and interaction costs. In particular, online fine-tuning has …

Critic-guided decision transformer for offline reinforcement learning

Y Wang, C Yang, Y Wen, Y Liu, Y Qiao - Proceedings of the AAAI …, 2024 - ojs.aaai.org
Recent advancements in offline reinforcement learning (RL) have underscored the
capabilities of Return-Conditioned Supervised Learning (RCSL), a paradigm that learns the …