Since its introduction in 2003, the influence maximization (IM) problem has drawn significant research attention. The aim of IM, which is NP-hard, is to select a set of users, known as seed users, who can influence the most individuals in a social network. State-of-the-art algorithms estimate the expected influence of nodes based on sampled diffusion paths. Because the number of required samples has recently been proven to be lower-bounded by a threshold that fixes the tradeoff between accuracy and efficiency, the result quality of these traditional solutions is hard to improve further without sacrificing efficiency. In this article, we present an orthogonal and novel paradigm that addresses the IM problem by leveraging deep reinforcement learning (RL) to estimate the expected influence. In particular, we present a novel framework called deeP reInforcement leArning-based iNfluence maximizatiOn (PIANO) that incorporates network embedding and RL techniques to address this problem. To make it practical, we further present PIANO-E and PIANO , both of which can be applied directly to answer IM queries without training the model from scratch. An experimental study on real-world networks demonstrates that PIANO achieves the best performance with respect to efficiency and influence spread quality compared to state-of-the-art classical solutions. We also demonstrate that the learned parametric models generalize well across different networks. In addition, we provide a pool of pretrained PIANO models so that any IM task can be addressed by directly applying a model from the pool, without training over the targeted network.
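To make the sampling-based baseline mentioned above concrete, the following is a minimal sketch of the classical approach that PIANO is positioned against, not of PIANO itself. It assumes an independent cascade diffusion model with a single uniform activation probability `p`, which the abstract does not specify; the function names (`simulate_ic`, `estimate_spread`, `greedy_im`) and the adjacency-dictionary graph format are hypothetical choices made for illustration.

```python
import random


def simulate_ic(graph, seeds, p, rng):
    """Run one sampled diffusion (assumed independent cascade model).

    graph maps each node to a list of its out-neighbors.
    Returns the number of nodes activated by the end of the cascade.
    """
    active = set(seeds)
    frontier = list(active)
    while frontier:
        next_frontier = []
        for u in frontier:
            for v in graph.get(u, ()):
                # Each newly active node gets one chance to activate v.
                if v not in active and rng.random() < p:
                    active.add(v)
                    next_frontier.append(v)
        frontier = next_frontier
    return len(active)


def estimate_spread(graph, seeds, p=0.1, runs=200, seed=0):
    """Monte Carlo estimate of expected influence from sampled diffusions."""
    rng = random.Random(seed)
    total = sum(simulate_ic(graph, seeds, p, rng) for _ in range(runs))
    return total / runs


def greedy_im(graph, k, p=0.1, runs=200, seed=0):
    """Greedily pick k seed users, each maximizing the estimated spread."""
    nodes = set(graph) | {v for nbrs in graph.values() for v in nbrs}
    chosen = []
    for _ in range(k):
        candidates = sorted(nodes - set(chosen))
        best = max(
            candidates,
            key=lambda v: estimate_spread(graph, chosen + [v], p, runs, seed),
        )
        chosen.append(best)
    return chosen
```

In this sketch, every candidate evaluation costs `runs` simulated cascades, so a more accurate estimate requires proportionally more samples. This is the accuracy versus efficiency tradeoff that motivates replacing the sampling step with a learned estimator.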