Mapping and navigation around small, unknown bodies remains a challenging and active problem in space exploration. Traditionally, spacecraft trajectories for mapping missions are designed by human experts, who spend hundreds of hours supervising the navigation and orbit-selection process. While this methodology has performed adequately for previous missions (such as Rosetta, Hayabusa, and Deep Space 1), growing demand for mapping missions will require greater onboard autonomy in the mapping and navigation process. In this work we provide a framework that formulates the autonomous imaging and mapping problem as a Partially Observable Markov Decision Process (POMDP). We also introduce a new simulation environment for the orbital mapping of small bodies and demonstrate that policies trained with our POMDP formulation maximize map quality while autonomously selecting orbits and supervising imaging tasks. We conclude with a discussion of integrating deep reinforcement learning modules with classical flight software systems and of the challenges of deploying deep RL in flight-ready systems.
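To make the POMDP framing concrete, the sketch below shows a minimal, hypothetical mapping environment: the hidden state is the true per-region map coverage, actions select among a few candidate orbits, and the agent only receives noisy coverage estimates. The visibility model, noise model, and reward (map-quality gain per step) are illustrative assumptions, not the paper's actual simulator.

```python
import random

class SmallBodyMappingEnv:
    """Illustrative POMDP sketch of small-body orbital mapping (assumed model,
    not the authors' simulator). Hidden state: true coverage of each surface
    region. Action: which candidate orbit to fly next. Observation: a noisy
    estimate of coverage. Reward: increase in total map coverage."""

    def __init__(self, n_regions=9, n_orbits=3, noise=0.05, seed=0):
        self.rng = random.Random(seed)
        self.n_regions = n_regions
        self.noise = noise
        # Hypothetical visibility model: each orbit images a fixed
        # subset of regions (here, every n_orbits-th region).
        self.orbit_footprints = [
            [r for r in range(n_regions) if r % n_orbits == k]
            for k in range(n_orbits)
        ]
        self.reset()

    def reset(self):
        self.coverage = [0.0] * self.n_regions  # hidden state
        return self._observe()

    def _observe(self):
        # Partial observability: Gaussian noise on the true coverage,
        # clipped to [0, 1].
        return [min(1.0, max(0.0, c + self.rng.gauss(0.0, self.noise)))
                for c in self.coverage]

    def step(self, orbit):
        before = sum(self.coverage)
        for r in self.orbit_footprints[orbit]:
            # Diminishing returns: each pass fills half the remaining gap.
            self.coverage[r] += 0.5 * (1.0 - self.coverage[r])
        reward = sum(self.coverage) - before      # map-quality gain
        done = min(self.coverage) > 0.95          # map considered complete
        return self._observe(), reward, done

# Example rollout with a trivial round-robin policy; an RL policy would
# instead map the observation history to an orbit choice.
env = SmallBodyMappingEnv()
obs = env.reset()
total = 0.0
for t in range(20):
    obs, reward, done = env.step(t % 3)
    total += reward
    if done:
        break
```

A trained policy would replace the round-robin action with one conditioned on the (noisy) observations, which is what lets it trade off revisiting poorly imaged regions against moving to new orbits.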