The integration of deep learning and theories of reinforcement learning (RL) is a promising avenue for exploring novel hypotheses on reward-based learning and decision-making in humans and other animals. Here, we trained deep RL agents and mice on the same sensorimotor task with high-dimensional state and action spaces and studied representation learning in their respective neural networks. Evaluation of thousands of neural network models with an extensive hyperparameter search revealed that the learning-dependent enrichment of state-value and policy representations in the deep RL agent optimized for task performance closely resembled neural activity in the posterior parietal cortex (PPC). These representations were critical for task performance in both systems. PPC neurons also exhibited representations of an internally defined subgoal, a feature of deep RL algorithms postulated to improve sample efficiency. This striking resemblance between the artificial and biological networks, together with their functional convergence in sensorimotor integration, offers new opportunities to better understand both intelligent systems.