S Garg, A Bajpai - Proceedings of the International Conference on …, 2019 - ojs.aaai.org
Neural planners for RDDL MDPs produce deep reactive policies in an offline fashion. These
scale well to large domains, but are sample-inefficient and time-consuming to train from …