In reinforcement learning, the advantage function is critical for policy improvement, but is often extracted from a learned Q-function. A natural question is: Why not learn the advantage …
Z Zhang, Y Gan, X Tan - Proceedings of the AAAI Conference on …, 2022 - ojs.aaai.org
Advantage Learning (AL) seeks to increase the action gap between the optimal action and its competitors, so as to improve the robustness to estimation errors. However, the method …
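For readers unfamiliar with the idea, the following is a minimal tabular sketch of how an AL-style backup can widen the action gap. It is not taken from the paper above. It assumes the commonly cited form of the operator, in which the standard Bellman optimality backup is reduced by alpha * (V(s) - Q(s, a)) with V(s) = max_a Q(s, a). The function names, the array layout (P as states x actions x next states, R as states x actions), and the parameters gamma and alpha are all illustrative assumptions.

```python
import numpy as np

def bellman_optimality_backup(Q, P, R, gamma):
    """Standard backup: R(s,a) + gamma * sum_s' P(s'|s,a) * max_a' Q(s',a')."""
    V = Q.max(axis=1)               # greedy state values, shape (S,)
    return R + gamma * (P @ V)      # (S, A, S) @ (S,) -> (S, A)

def advantage_learning_backup(Q, P, R, gamma, alpha):
    """Assumed AL form: standard backup minus alpha * (V(s) - Q(s,a)).

    The subtracted term is zero for the greedy action and positive for
    every other action, so non-greedy actions are pushed down.
    """
    V = Q.max(axis=1, keepdims=True)  # shape (S, 1), broadcasts over actions
    return bellman_optimality_backup(Q, P, R, gamma) - alpha * (V - Q)

def action_gap(Q):
    """Per-state difference between the best and second-best action value."""
    top_two = np.sort(Q, axis=1)[:, -2:]
    return top_two[:, 1] - top_two[:, 0]
```

In this sketch a single AL backup never exceeds the standard backup, and the two agree on the action that is greedy under the current Q, which is the sense in which the gap to competing actions is increased.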