program locations where they are uncertain about how to best code the program logic.
Reinforcement learning (RL) is then used to automatically learn choice-point
decisions that optimize the reward achieved by the program. In this paper, we consider a new
approach to explaining the learned decisions of adaptive programs. The key idea is to
include simple program annotations that define multiple semantically meaningful reward …