J Zhu, T Mao, M Zhang, Q Ge, Q Wu… - IEEE Transactions on …, 2024 - ieeexplore.ieee.org
In multiagent reinforcement learning, policy evaluation is a central problem. To solve this
problem, decentralized temporal-difference (TD) learning is one of the most popular …