Distributed key-value stores (KVS) have been rapidly adopted by a wide range of applications in recent years, owing to advantages such as hypertext transfer protocol-based RESTful application programming interfaces, high availability, and elasticity. Because it scales well, consistent hashing is commonly used as the data placement mechanism in KVS systems. Despite these advantages, KVS systems were not designed to adapt dynamically to changing workloads, which often exhibit data access skew. Furthermore, the underlying physical storage nodes may be heterogeneous and do not expose their performance capabilities to higher-level data placement layers. In this paper, we address these issues and propose an essential step toward a dynamic, autonomous solution by leveraging deep reinforcement learning. We design a self-learning approach that incrementally changes the data placement to improve load balancing. Our approach is dynamic in the sense that it is capable of avoiding hot spots, that is, overloaded storage nodes, when facing different workloads. We also design our solution to be pluggable: it assumes no prior knowledge of the storage nodes' capabilities, so different KVS deployments can make use of it. Experiments in a distributed KVS deployment show that our method performs well on changing workloads, including those with data access skew.
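To make the hot-spot problem concrete, the following is a minimal sketch of consistent hashing under a skewed workload. It is not the placement mechanism or the learning method of this paper: the `HashRing` class, the node names, the number of virtual nodes, and the synthetic workload are all hypothetical choices made for illustration. It shows that even when keys are placed by a fixed hash, a single heavily accessed key concentrates requests on one node.

```python
import bisect
import hashlib
from collections import Counter


def _hash(value: str) -> int:
    """Map a string to a point on the ring (first 8 bytes of its MD5 digest)."""
    return int.from_bytes(hashlib.md5(value.encode("utf-8")).digest()[:8], "big")


class HashRing:
    """Minimal consistent-hash ring with virtual nodes (illustrative only)."""

    def __init__(self, nodes, vnodes=64):
        # Each physical node owns `vnodes` points on the ring.
        self._ring = sorted(
            (_hash(f"{node}#{i}"), node) for node in nodes for i in range(vnodes)
        )
        self._points = [point for point, _ in self._ring]

    def lookup(self, key: str) -> str:
        # First ring point clockwise from the key's hash, wrapping around.
        idx = bisect.bisect_right(self._points, _hash(key)) % len(self._ring)
        return self._ring[idx][1]


def load_per_node(ring, accesses):
    """Count requests per node for a sequence of accessed keys."""
    return Counter(ring.lookup(key) for key in accesses)


# Hypothetical three-node deployment and key set.
ring = HashRing(["node-a", "node-b", "node-c"])
keys = [f"key-{i}" for i in range(1000)]

# Uniform workload: every key is read once.
uniform = load_per_node(ring, keys)

# Skewed workload: one hot key receives as many reads as all other keys combined.
skewed = load_per_node(ring, keys + ["key-0"] * 1000)
hot_node = ring.lookup("key-0")

print("uniform:", dict(uniform))
print("skewed: ", dict(skewed), "hot node:", hot_node)
```

In this synthetic workload, the node that owns the hot key serves more than half of all requests regardless of how evenly the keys themselves are spread. Static hash-based placement cannot correct this kind of imbalance on its own, which is the motivation for adapting placement to the observed load.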