A hybrid of deep reinforcement learning and local search for the vehicle routing problems

J Zhao, M Mao, X Zhao, J Zou - IEEE Transactions on Intelligent …, 2020 - ieeexplore.ieee.org
J Zhao, M Mao, X Zhao, J Zou
IEEE Transactions on Intelligent Transportation Systems, 2020ieeexplore.ieee.org
Different variants of the Vehicle Routing Problem (VRP) have been studied for decades.
State-of-the-art methods based on local search have been developed for VRPs, while still
facing problems of slow running time and poor solution quality in the case of large problem
size. To overcome these problems, we first propose a novel deep reinforcement learning
(DRL) model, which is composed of an actor, an adaptive critic and a routing simulator. The
actor, based on the attention mechanism, is designed to generate routing strategies. The …
Different variants of the Vehicle Routing Problem (VRP) have been studied for decades. State-of-the-art methods based on local search have been developed for VRPs, while still facing problems of slow running time and poor solution quality in the case of large problem size. To overcome these problems, we first propose a novel deep reinforcement learning (DRL) model, which is composed of an actor, an adaptive critic and a routing simulator. The actor, based on the attention mechanism, is designed to generate routing strategies. The adaptive critic is devised to change the network structure adaptively, in order to accelerate the convergence rate and improve the solution quality during training. The routing simulator is developed to provide graph information and reward with the actor and adaptive cirtic. Then, we combine this DRL model with a local search method to further improve the solution quality. The output of the DRL model can serve as the initial solution for the following local search method, from where the final solution of the VRP is obtained. Tested on three datasets with customer points of 20, 50 and 100 respectively, experimental results demonstrate that the DRL model alone finds better solutions compared to construction algorithms and previous DRL approaches, while enabling a 5- to 40-fold speedup. We also observe that combining the DRL model with various local search methods yields excellent solutions at a superior generation speed, comparing to that of other initial solutions.
ieeexplore.ieee.org
以上显示的是最相近的搜索结果。 查看全部搜索结果