PcLast: Discovering Plannable Continuous Latent States

A Koul, S Sujit, S Chen, B Evans, L Wu, B Xu, R Chari, R Islam, R Seraj, Y Efroni, L Molu, et al. arXiv preprint arXiv:2311.03534, 2023. arxiv.org
Goal-conditioned planning benefits from learned low-dimensional representations of rich, high-dimensional observations. While compact latent representations, typically learned from variational autoencoders or inverse dynamics, enable goal-conditioned planning, they ignore state affordances, hampering sample-efficient planning. In this paper, we learn a representation that associates reachable states together for effective onward planning. We first learn a latent representation with multi-step inverse dynamics (to remove distracting information), and then transform this representation to associate reachable states together in latent space. Our proposals are rigorously tested in various simulation testbeds. Numerical results in reward-based and reward-free settings show significant improvements in sample efficiency and yield layered state abstractions that enable computationally efficient hierarchical planning.
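Since the abstract only outlines the two-stage recipe, the following is a minimal, hypothetical PyTorch sketch of what it could look like; the network architectures, the use of an action classifier for inverse dynamics, the contrastive margin form of the reachability objective, and all hyperparameters are assumptions for illustration, not the paper's actual implementation.

```python
# Hedged sketch (not the authors' code). Stage 1: learn an encoder phi with a
# multi-step inverse-dynamics loss, which discards observation features that do
# not affect action prediction. Stage 2: learn a map f on top of phi that pulls
# states reachable within k steps together in L2 distance. Sizes, MAX_K, and
# the margin are illustrative assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F

OBS_DIM, LATENT_DIM, N_ACTIONS = 64, 16, 4

phi = nn.Sequential(nn.Linear(OBS_DIM, 128), nn.ReLU(), nn.Linear(128, LATENT_DIM))
# Inverse model: predicts the first action a_t from (phi(o_t), phi(o_{t+k}), k).
inv = nn.Sequential(nn.Linear(2 * LATENT_DIM + 1, 128), nn.ReLU(),
                    nn.Linear(128, N_ACTIONS))
# Stage-2 transformation applied on top of the (frozen) stage-1 encoder.
f = nn.Sequential(nn.Linear(LATENT_DIM, 64), nn.ReLU(), nn.Linear(64, LATENT_DIM))

def inverse_dynamics_loss(obs_t, obs_tk, act_t, k):
    """Multi-step inverse dynamics: classify a_t given phi(o_t), phi(o_{t+k}), k."""
    z_t, z_tk = phi(obs_t), phi(obs_tk)
    logits = inv(torch.cat([z_t, z_tk, k.float().unsqueeze(-1)], dim=-1))
    return F.cross_entropy(logits, act_t)

def reachability_loss(obs_a, obs_b, reachable, margin=1.0):
    """Pull f(phi(.)) of reachable pairs together; push other pairs past a margin."""
    with torch.no_grad():              # stage-1 encoder assumed frozen here
        za, zb = phi(obs_a), phi(obs_b)
    d = (f(za) - f(zb)).norm(dim=-1)
    pos = reachable * d.pow(2)                      # reachable: 1.0 if within k steps
    neg = (1.0 - reachable) * F.relu(margin - d).pow(2)
    return (pos + neg).mean()
```

Under this reading, planning then operates on distances in the transformed space, where small L2 distance is trained to mean "reachable in few steps", which is what would make greedy or hierarchical goal-conditioned planning sample-efficient.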