The Emphatic Approach to Average-Reward Policy Evaluation

2 results (0.02 seconds)

[PDF] ualberta.ca

Learning and Planning with the Average-Reward Formulation

Y Wan - 2023 - era.library.ualberta.ca

The average-reward formulation is a natural and important formulation of learning and
planning problems, yet has received much less attention than the episodic and discounted …

Cited by 2 · Related articles · All 3 versions

[PDF] openmindresearch.org

[PDF] Continual Meta Learning

A Sharifnassab - openmindresearch.org

Background: The origins of meta step-size optimization date back to seminal works such as
Kesten's accelerated procedure (Kesten, 1958) and Incremental Delta-Bar-Delta …
