[图书][B] Using Exact Models to Analyze Policy Gradient Algorithms

G McCracken - 2021 - search.proquest.com
Policy gradient methods are extensively used in reinforcement learning as a way to optimize
expected return. In this thesis, we build upon the idea that in general, the evolution of a …