Diffusion model is an effective planner and data synthesizer for multi-task reinforcement learning

H He, C Bai, K Xu, Z Yang, W Zhang… - Advances in Neural …, 2023 - proceedings.neurips.cc
Diffusion models have demonstrated highly-expressive generative capabilities in vision and
NLP. Recent studies in reinforcement learning (RL) have shown that diffusion models are …

Synthetic experience replay

C Lu, P Ball, YW Teh… - Advances in Neural …, 2024 - proceedings.neurips.cc
A key theme in the past decade has been that when large neural networks and large
datasets combine, they can produce remarkable results. In deep reinforcement learning (RL) …

Corruption-robust offline reinforcement learning with general function approximation

C Ye, R Yang, Q Gu, T Zhang - Advances in Neural …, 2024 - proceedings.neurips.cc
We investigate the problem of corruption robustness in offline reinforcement learning (RL)
with general function approximation, where an adversary can corrupt each sample in the …

Anti-exploration by random network distillation

A Nikulin, V Kurenkov, D Tarasov… - … on Machine Learning, 2023 - proceedings.mlr.press
Despite the success of Random Network Distillation (RND) in various domains, it was shown
to be insufficiently discriminative to be used as an uncertainty estimator for penalizing out-of …

Revisiting the minimalist approach to offline reinforcement learning

D Tarasov, V Kurenkov, A Nikulin… - Advances in Neural …, 2024 - proceedings.neurips.cc
Recent years have witnessed significant advancements in offline reinforcement learning
(RL), resulting in the development of numerous algorithms with varying degrees of …

Unleashing the power of pre-trained language models for offline reinforcement learning

R Shi, Y Liu, Y Ze, SS Du, H Xu - arXiv preprint arXiv:2310.20587, 2023 - arxiv.org
Offline reinforcement learning (RL) aims to find a near-optimal policy using pre-collected
datasets. In real-world scenarios, data collection could be costly and risky; therefore, offline …

Towards robust offline reinforcement learning under diverse data corruption

R Yang, H Zhong, J Xu, A Zhang, C Zhang… - arXiv preprint arXiv …, 2023 - arxiv.org
Offline reinforcement learning (RL) presents a promising approach for learning reinforced
policies from offline datasets without the need for costly or unsafe interactions with the …

Uni-O4: Unifying online and offline deep reinforcement learning with multi-step on-policy optimization

K Lei, Z He, C Lu, K Hu, Y Gao, H Xu - arXiv preprint arXiv:2311.03351, 2023 - arxiv.org
Combining offline and online reinforcement learning (RL) is crucial for efficient and safe
learning. However, previous approaches treat offline and online learning as separate …

ACT: Empowering decision transformer with dynamic programming via advantage conditioning

CX Gao, C Wu, M Cao, R Kong, Z Zhang… - Proceedings of the AAAI …, 2024 - ojs.aaai.org
Decision Transformer (DT), which employs expressive sequence modeling techniques to
perform action generation, has emerged as a promising approach to offline policy …

Reinformer: Max-return sequence modeling for offline RL

Z Zhuang, D Peng, J Liu, Z Zhang, D Wang - arXiv preprint arXiv …, 2024 - arxiv.org
As a data-driven paradigm, offline reinforcement learning (RL) has been formulated as
sequence modeling that conditions on the hindsight information including returns, goal or …