The Paths Perspective on Value Learning

S Greydanus, C Olah - Distill, 2019 - distill.pub
One of the key sub-problems of RL is value estimation: learning the long-term consequences of being in a state. This can be tricky because future returns are generally noisy, affected by many things other than the present state. The further we look into the future, the more this becomes true. But while difficult, estimating value is also essential to many approaches to RL. For many approaches (policy-value iteration), estimating value is essentially the whole problem, while in other approaches (actor-critic models), value estimation is essential for reducing noise.