Provably efficient reward-agnostic navigation with linear value iteration

A Zanette, A Lazaric… - Advances in Neural …, 2020 - proceedings.neurips.cc
There has been growing progress on theoretical analyses for provably efficient learning in
MDPs with linear function approximation, but much of the existing work has made strong …