R Zhong, D Zhang, L Schäfer… - Advances in Neural …, 2022 - proceedings.neurips.cc
Reinforcement learning (RL) algorithms are often categorized as either on-policy or
off-policy depending on whether they use data from a target policy of interest or from a different …