Bootstrapping with models: Confidence intervals for off-policy evaluation - Academic Resource Search


1 result (0.07 seconds)

A survey of deep reinforcement learning

王浩楠, 刘苧, 章艺云, 冯大伟, 黄峰… - Frontiers of Information Technology & Electronic Engineering …, 2022 - fitee.zjujournals.com

… We provide a detailed review of state-of-the-art RL methods and … Otherwise, you can choose
the model-free off-policy algorithms that re… 3 shows the architecture of bootstrapped DQN. …
