Related articles - Academic resource search

Bootstrapped transformer for offline reinforcement learning

K Wang, H Zhao, X Luo, K Ren… - Advances in Neural …, 2022 - proceedings.neurips.cc

Offline reinforcement learning (RL) aims at learning policies from previously collected static
trajectory data without interacting with the real environment. Recent works provide a novel …

Cited by 41 · Related articles · All 7 versions

[PDF] neurips.cc

Train once, get a family: State-adaptive balances for offline-to-online reinforcement learning

S Wang, Q Yang, J Gao, M Lin… - Advances in …, 2024 - proceedings.neurips.cc

Offline-to-online reinforcement learning (RL) is a training paradigm that combines
pre-training on a pre-collected dataset with fine-tuning in an online environment. However, the …

Cited by 6 · Related articles · All 6 versions

[PDF] neurips.cc

Adversarial model for offline reinforcement learning

M Bhardwaj, T Xie, B Boots, N Jiang… - Advances in Neural …, 2024 - proceedings.neurips.cc

We propose a novel model-based offline Reinforcement Learning (RL) framework, called
Adversarial Model for Offline Reinforcement Learning (ARMOR), which can robustly learn …

Cited by 20 · Related articles · All 7 versions

[PDF] arxiv.org

OpenAI Gym

G Brockman, V Cheung, L Pettersson… - arXiv preprint arXiv …, 2016 - arxiv.org

OpenAI Gym is a toolkit for reinforcement learning research. It includes a growing collection
of benchmark problems that expose a common interface, and a website where people can …

Cited by 7654 · Related articles · All 6 versions

[PDF] neurips.cc

Supported policy optimization for offline reinforcement learning

J Wu, H Wu, Z Qiu, J Wang… - Advances in Neural …, 2022 - proceedings.neurips.cc

Policy constraint methods to offline reinforcement learning (RL) typically utilize
parameterization or regularization that constrains the policy to perform actions within the …

Cited by 46 · Related articles · All 9 versions

[PDF] neurips.cc

Uncertainty-based offline reinforcement learning with diversified Q-ensemble

G An, S Moon, JH Kim… - Advances in neural …, 2021 - proceedings.neurips.cc

Offline reinforcement learning (offline RL), which aims to find an optimal policy from a
previously collected static dataset, bears algorithmic difficulties due to function …

Cited by 240 · Related articles · All 7 versions

[PDF] openreview.net

Should I run offline reinforcement learning or behavioral cloning?

A Kumar, J Hong, A Singh, S Levine - International Conference on …, 2021 - openreview.net

Offline reinforcement learning (RL) algorithms can acquire effective policies by utilizing only
previously collected experience, without any online interaction. While it is widely understood …

Cited by 69 · Related articles · All 2 versions

[PDF] mlr.press

Instabilities of offline RL with pre-trained neural representation

R Wang, Y Wu, R Salakhutdinov… - … on Machine Learning, 2021 - proceedings.mlr.press

In offline reinforcement learning (RL), we seek to utilize offline data to evaluate (or learn)
policies in scenarios where the data are collected from a distribution that substantially differs …

Cited by 50 · Related articles · All 8 versions

[PDF] neurips.cc

Adaptive auxiliary task weighting for reinforcement learning

X Lin, H Baweja, G Kantor… - Advances in neural …, 2019 - proceedings.neurips.cc

Reinforcement learning is known to be sample inefficient, preventing its application to many
real-world problems, especially with high dimensional observations like images …

Cited by 116 · Related articles · All 7 versions

[PDF] neurips.cc

Oracle inequalities for model selection in offline reinforcement learning

JN Lee, G Tucker, O Nachum, B Dai… - Advances in Neural …, 2022 - proceedings.neurips.cc

In offline reinforcement learning (RL), a learner leverages prior logged data to learn a good
policy without interacting with the environment. A major challenge in applying such methods …

Cited by 14 · Related articles · All 8 versions
