Y Wu, G Tucker, O Nachum - arXiv preprint arXiv:1911.11361, 2019 - cs.cmu.edu
In reinforcement learning (RL) research, it is common to assume access to direct online
interactions with the environment. However, in many real-world applications, access to the …