Offline RL Without Off-Policy Evaluation

D Brandfonbrener, W Whitney… - Advances in neural …, 2021 - proceedings.neurips.cc
Most prior approaches to offline reinforcement learning (RL) have taken an iterative actor-
critic approach involving off-policy evaluation. In this paper we show that simply doing one …

Offline RL Without Off-Policy Evaluation

D Brandfonbrener, WF Whitney… - 35th Conference on …, 2021 - nyuscholars.nyu.edu
Most prior approaches to offline reinforcement learning (RL) have taken an iterative actor-
critic approach involving off-policy evaluation. In this paper we show that simply doing one …

Offline RL Without Off-Policy Evaluation

D Brandfonbrener, WF Whitney, R Ranganath… - Proceedings of the 35th …, 2021 - dl.acm.org
Most prior approaches to offline reinforcement learning (RL) have taken an iterative actor-
critic approach involving off-policy evaluation. In this paper we show that simply doing one …

Offline RL Without Off-Policy Evaluation

D Brandfonbrener, WF Whitney, R Ranganath… - Advances in Neural … - openreview.net
Most prior approaches to offline reinforcement learning (RL) have taken an iterative actor-
critic approach involving off-policy evaluation. In this paper we show that simply doing one …

Offline RL Without Off-Policy Evaluation

D Brandfonbrener, W Whitney… - Advances in Neural …, 2021 - proceedings.neurips.cc
Most prior approaches to offline reinforcement learning (RL) have taken an iterative actor-
critic approach involving off-policy evaluation. In this paper we show that simply doing one …

Offline RL Without Off-Policy Evaluation

D Brandfonbrener, WF Whitney, R Ranganath… - arXiv preprint arXiv …, 2021 - arxiv.org
Most prior approaches to offline reinforcement learning (RL) have taken an iterative actor-
critic approach involving off-policy evaluation. In this paper we show that simply doing one …

Offline RL Without Off-Policy Evaluation

D Brandfonbrener, WF Whitney, R Ranganath… - arXiv e …, 2021 - ui.adsabs.harvard.edu
Most prior approaches to offline reinforcement learning (RL) have taken an iterative actor-
critic approach involving off-policy evaluation. In this paper we show that simply doing one …

Offline RL Without Off-Policy Evaluation

D Brandfonbrener, WF Whitney, R Ranganath… - Advances in neural …, 2021 - par.nsf.gov
Most prior approaches to offline reinforcement learning (RL) have taken an iterative actor-
critic approach involving off-policy evaluation. In this paper we show that simply doing one …

Offline RL Without Off-Policy Evaluation

D Brandfonbrener, WF Whitney, R Ranganath… - Advances in neural …, 2021 - par.nsf.gov
Most prior approaches to offline reinforcement learning (RL) have taken an iterative actor-
critic approach involving off-policy evaluation. In this paper we show that simply doing one …