is often unclear what strategies they use to do so. In this paper, we take a step toward
explaining deep RL agents through a case study using Atari 2600 environments. In
particular, we focus on using saliency maps to understand how an agent learns and
executes a policy. We introduce a method for generating useful saliency maps and use it to
show 1) what strong agents attend to, 2) whether agents are making decisions for the right or …