Towards robust offline-to-online reinforcement learning via uncertainty and smoothness

H Sun, B van Breugel, J Crabbé… - Advances in …, 2023 - proceedings.neurips.cc

Uncertainty quantification (UQ) is essential for creating trustworthy machine learning
models. Recent years have seen a steep rise in UQ methods that can flag suspicious …

Cited by 6 - Related articles - All 3 versions

[PDF] arxiv.org

Towards robust offline reinforcement learning under diverse data corruption

R Yang, H Zhong, J Xu, A Zhang, C Zhang… - arXiv preprint arXiv …, 2023 - arxiv.org

Offline reinforcement learning (RL) presents a promising approach for learning reinforced
policies from offline datasets without the need for costly or unsafe interactions with the …

Cited by 13 - Related articles - All 3 versions

[PDF] arxiv.org

Contrastive representation for data filtering in cross-domain offline reinforcement learning

X Wen, C Bai, K Xu, X Yu, Y Zhang, X Li… - arXiv preprint arXiv …, 2024 - arxiv.org

Cross-domain offline reinforcement learning leverages source domain data with diverse
transition dynamics to alleviate the data requirement for the target domain. However, simply …

Cited by 4 - Related articles - All 3 versions

[PDF] arxiv.org

Ensemble successor representations for task generalization in offline-to-online reinforcement learning

C Wang, X Yu, C Bai, Q Zhang, Z Wang - Science China Information …, 2024 - Springer

In reinforcement learning (RL), training a policy from scratch with online experiences can be
inefficient because of the difficulties in exploration. Recently, offline RL provides a promising …

Cited by 1 - Related articles - All 2 versions

[PDF] researchgate.net

[PDF] What is Flagged in Uncertainty Quantification? Latent Density Models for Uncertainty Categorization

H Sun, B van Breugel, J Crabbe, N Seedat… - arXiv preprint arXiv …, 2022 - researchgate.net

Uncertainty quantification (UQ) is essential for creating trustworthy machine learning
models. Recent years have seen a steep rise in UQ methods that can flag suspicious …

Cited by 7 - Related articles - All 2 versions

[PDF] arxiv.org

Constrained Ensemble Exploration for Unsupervised Skill Discovery

C Bai, R Yang, Q Zhang, K Xu, Y Chen, T Xiao… - arXiv preprint arXiv …, 2024 - arxiv.org

Unsupervised Reinforcement Learning (RL) provides a promising paradigm for learning
useful behaviors via reward-free pre-training. Existing methods for unsupervised RL mainly …

Cited by 2 - Related articles - All 3 versions

[PDF] arxiv.org

Residual Learning and Context Encoding for Adaptive Offline-to-Online Reinforcement Learning

M Nakhaei, A Scannell, J Pajarinen - arXiv preprint arXiv:2406.08238, 2024 - arxiv.org

Offline reinforcement learning (RL) allows learning sequential behavior from fixed datasets.
Since offline datasets do not cover all possible situations, many methods collect additional …

Cited by 1 - Related articles - All 2 versions

[PDF] arxiv.org

Entropy Regularized Task Representation Learning for Offline Meta-Reinforcement Learning

A Scannell, J Pajarinen - arXiv preprint arXiv:2412.14834, 2024 - arxiv.org

Offline meta-reinforcement learning aims to equip agents with the ability to rapidly adapt to
new tasks by training on data from a set of different tasks. Context-based approaches utilize …
