Animals are able to rapidly infer, from limited experience, when sets of state-action pairs have equivalent reward and transition dynamics. On the other hand, modern reinforcement …
We consider the situation when a learner faces a set of unknown discrete distributions $(p_k)_{k\in\mathcal K}$ defined over a common alphabet $\mathcal X$, and can build for …