D Brandfonbrener, WF Whitney, R Ranganath… - Proceedings of the 35th …, 2021 - dl.acm.org
Most prior approaches to offline reinforcement learning (RL) have taken an iterative
actor-critic approach involving off-policy evaluation. In this paper we show that simply doing one …