M Janner, Q Li, S Levine - Advances in neural information …, 2021 - proceedings.neurips.cc
… When the goal is to reproduce the distribution of trajectories in the training data, we can optimize directly for the probability of a trajectory τ. This situation matches the goal of sequence …
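The excerpt above describes optimizing directly for the likelihood of whole trajectories with a sequence model. A minimal sketch of that idea, assuming trajectories have already been discretized into token sequences and using a small recurrent model as a stand-in for whatever architecture the paper actually uses (the names TrajectorySequenceModel and nll_loss are illustrative, not from the source):

```python
# Hedged sketch: maximize log p(tau) by next-token prediction over
# discretized trajectory tokens. All names below are assumptions.
import torch
import torch.nn as nn

class TrajectorySequenceModel(nn.Module):
    def __init__(self, vocab_size: int, hidden: int = 256):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, hidden)
        self.rnn = nn.GRU(hidden, hidden, batch_first=True)  # stand-in for a Transformer
        self.head = nn.Linear(hidden, vocab_size)

    def forward(self, tokens):                # tokens: (batch, T) integer ids
        h, _ = self.rnn(self.embed(tokens))
        return self.head(h)                   # logits: (batch, T, vocab)

def nll_loss(model, tokens):
    # Negative log-likelihood of each trajectory: predict token t+1 from
    # tokens up to t, so minimizing this maximizes p(tau) under the model.
    logits = model(tokens[:, :-1])
    return nn.functional.cross_entropy(
        logits.reshape(-1, logits.size(-1)), tokens[:, 1:].reshape(-1)
    )

# Usage on a batch of discretized trajectories (synthetic data for illustration):
model = TrajectorySequenceModel(vocab_size=1000)
batch = torch.randint(0, 1000, (8, 128))      # 8 trajectories, 128 tokens each
loss = nll_loss(model, batch)
loss.backward()
```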
… offline trajectory data, we investigate the impact of data sampling processes on offline RL algorithms from a trajectory … In this section, we evaluate PTR, which optimizes the trajectory …
… optimal and suboptimal trajectories without predefined returns, often resulting in suboptimal policies that mirror the distribution of the training data. To overcome the limitations of IL, …
… over the average return of trajectories in the dataset. We … offline RL algorithms of staying close to the trajectories in the dataset. If the dataset primarily consists of sub-optimal trajectories, …
Q Lin, B Tang, Z Wu, C Yu, S Mao… - … Machine Learning, 2023 - proceedings.mlr.press
… To model the optimal trajectory distribution w.r.t. a certain … To obtain the optimal trajectory distribution in Theorem 4.1, … 2022), a recently proposed trajectory optimization framework that …
… In this paper, we study an offline RL setup for learning from heterogeneous datasets where trajectories are collected using policies with different purposes, leading to a multi-modal data …
… We examine domains that contain near-optimal trajectories, where single-step methods perform well, as well as domains with no optimal trajectories at all, which require multi-step …
… Another way to optimize the reinforcement learning objective … to then recover a near-optimal policy. A value function provides … sample new trajectories from πβ, while old trajectories are …
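This excerpt contrasts trajectory-level modeling with value-based policy recovery. As one generic illustration (not the specific method of either cited paper), a sketch of fitting a value function by temporal-difference learning on a fixed dataset and reading off a greedy policy, assuming discrete actions and transitions stored as tensors; QNetwork, td_update, and greedy_policy are hypothetical names:

```python
# Hedged sketch: recover a policy from offline data via a learned Q-function.
import torch
import torch.nn as nn

class QNetwork(nn.Module):
    def __init__(self, obs_dim: int, n_actions: int, hidden: int = 256):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(obs_dim, hidden), nn.ReLU(), nn.Linear(hidden, n_actions)
        )

    def forward(self, obs):
        return self.net(obs)                     # Q(s, .) for every discrete action

def td_update(q, q_target, batch, optimizer, gamma=0.99):
    # One temporal-difference step on a batch from the fixed (pi_beta-collected) dataset.
    obs, act, rew, next_obs, done = batch
    with torch.no_grad():
        target = rew + gamma * (1 - done) * q_target(next_obs).max(dim=1).values
    pred = q(obs).gather(1, act.unsqueeze(1)).squeeze(1)
    loss = nn.functional.mse_loss(pred, target)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()

def greedy_policy(q, obs):
    # The recovered policy acts greedily with respect to the learned values.
    return q(obs).argmax(dim=1)

# Usage (hypothetical dimensions): q = QNetwork(obs_dim=17, n_actions=6)
```

In a purely offline setting, the max over actions in the target is usually constrained or regularized so the learned policy stays close to the data distribution, which connects to the point above about offline RL methods hewing to dataset trajectories.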
… ] and trajectory optimization [27… with learning an optimal policy and an optimal trajectory distribution, respectively. Currently, a limited number of works have reviewed the field of offline …