J Fu, A Kumar, O Nachum, G Tucker… - arXiv e-prints, 2020 - ui.adsabs.harvard.edu
The offline reinforcement learning (RL) setting (also known as full batch RL), where a policy is learned from a static dataset, is compelling as progress enables RL methods to take …
J Fu, A Kumar, O Nachum, G Tucker, S Levine - openreview.net
The offline reinforcement learning (RL) problem, also known as batch RL, refers to the setting where a policy must be learned from a static dataset, without additional online data …