A Huang, J Chen, N Jiang - International Conference on …, 2023 - proceedings.mlr.press
MDPs with low-rank transitions—that is, the transition matrix can be factored into the product of two matrices, left and right—is a highly representative structure that enables tractable …
In this paper, we study nonparametric estimation of instrumental variable (IV) regressions. Recently, many flexible machine learning methods have been developed for instrumental …
We consider offline reinforcement learning (RL) where we only have access to offline data. In contrast to numerous offline RL algorithms that necessitate the uniform coverage of …
N Kallus, M Uehara - arXiv preprint arXiv:1909.05850, 2019 - arxiv.org
Off-policy evaluation (OPE) in reinforcement learning is notoriously difficult in long- and infinite-horizon settings due to diminishing overlap between behavior and target policies. In …
P Amortila, N Jiang… - … Conference on Machine …, 2023 - proceedings.mlr.press
Theoretical guarantees in reinforcement learning (RL) are known to suffer multiplicative blow-up factors with respect to the misspecification error of function approximation. Yet, the …
D Cao, A Zhou - arXiv preprint arXiv:2406.08697, 2024 - arxiv.org
Offline reinforcement learning is important in many settings with available observational data but the inability to deploy new policies online due to safety, cost, and other concerns. Many …
We study the problem of offline decision making, which focuses on learning decisions from datasets only partially correlated with the learning objective. While previous research has …
C Mao - arXiv preprint arXiv:2305.12679, 2023 - arxiv.org
We study learning optimal policies from a logged dataset, i.e., offline RL, with function approximation. Despite the efforts devoted, existing algorithms with theoretic finite-sample …
Reinforcement Learning (RL) is an area of machine learning where an intelligent agent solves sequential decision-making problems based on experience. Recent advances in the …