All versions - Academic Resource Search


3 results (0.01 seconds)

Adaptive Advantage-Guided Policy Regularization for Offline Reinforcement Learning

T Liu, Y Li, Y Lan, H Gao, W Pan, X Xu - arXiv preprint arXiv:2405.19909, 2024 - arxiv.org

In offline reinforcement learning, the challenge of out-of-distribution (OOD) is pronounced.
To address this, existing methods often constrain the learned policy through policy …

Adaptive Advantage-Guided Policy Regularization for Offline Reinforcement Learning

T Liu, Y Li, Y Lan, H Gao, W Pan, X Xu - Forty-first International Conference … - openreview.net

In offline reinforcement learning, the challenge of out-of-distribution (OOD) is pronounced.
To address this, existing methods often constrain the learned policy through policy …

Adaptive Advantage-Guided Policy Regularization for Offline Reinforcement Learning

T Liu, Y Li, Y Lan, H Gao, W Pan, X Xu - arXiv e-prints, 2024 - ui.adsabs.harvard.edu

In offline reinforcement learning, the challenge of out-of-distribution (OOD) is pronounced.
To address this, existing methods often constrain the learned policy through policy …

