Rich-Observation Reinforcement Learning with Continuous Latent Dynamics

Y Song, L Wu, DJ Foster, A Krishnamurthy - arXiv preprint arXiv …, 2024 - arxiv.org
Sample-efficiency and reliability remain major bottlenecks toward wide adoption of
reinforcement learning algorithms in continuous settings with high-dimensional perceptual …

Deep Non-Parametric Abstractions of Markov Decision Processes for Exact Planning

AK Shrestha - 2024 - ir.library.oregonstate.edu
Markov Decision Processes (MDPs) provide a powerful framework for sequential
decision making, but solving large-scale MDPs that model real-world problems remains a …