Online estimation via offline estimation: An information-theoretic framework

DJ Foster, Y Han, J Qian, A Rakhlin - arXiv preprint arXiv:2404.10122, 2024 - arxiv.org
The classical theory of statistical estimation aims to estimate a parameter of interest
under data generated from a fixed design ("offline estimation"), while the contemporary …

Scalable Online Exploration via Coverability

P Amortila, DJ Foster, A Krishnamurthy - arXiv preprint arXiv:2403.06571, 2024 - arxiv.org
Exploration is a major challenge in reinforcement learning, especially for high-dimensional
domains that require function approximation. We propose exploration objectives--policy …

Offline Reinforcement Learning: Role of State Aggregation and Trajectory Data

Z Jia, A Rakhlin, A Sekhari, CY Wei - arXiv preprint arXiv:2403.17091, 2024 - arxiv.org
We revisit the problem of offline reinforcement learning with value function realizability but
without Bellman completeness. Previous work by Xie and Jiang (2021) and Foster et …

RL in Latent MDPs is Tractable: Online Guarantees via Off-Policy Evaluation

J Kwon, S Mannor, C Caramanis, Y Efroni - arXiv preprint arXiv …, 2024 - arxiv.org
In many real-world decision problems there is partially observed, hidden or latent
information that remains fixed throughout an interaction. Such decision problems can be …