Deep reinforcement learning for dynamic uplink/downlink resource allocation in high mobility 5G HetNet

F Tang, Y Zhou, N Kato - IEEE Journal on Selected Areas in Communications, 2020 - ieeexplore.ieee.org
5G has recently been widely deployed to support communication for high-mobility nodes, including trains, vehicles, and unmanned aerial vehicles (UAVs), which have emerged as main components of the wireless heterogeneous network (HetNet). To further improve radio utilization, Time Division Duplex (TDD) is considered a potential full-duplex communication technology for the high-mobility 5G network. However, high user mobility leads to highly dynamic network traffic and unpredictable link-state changes. A new method that predicts the dynamic traffic and channel conditions and schedules the TDD configuration in real time is therefore essential in the high-mobility environment. In this paper, we investigate the channel model in the high-mobility heterogeneous network and propose a novel deep reinforcement learning based intelligent TDD configuration algorithm that dynamically allocates radio resources in an online manner. In the proposal, a deep neural network extracts features from the complex network information, and dynamic Q-value iteration based reinforcement learning with an experience replay memory mechanism adaptively changes the TDD uplink/downlink ratio according to the evaluated rewards. Simulation results show that the proposal significantly improves network performance in terms of both network throughput and packet loss rate compared with conventional TDD resource allocation algorithms.
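The abstract's core loop (observe the network, pick an uplink/downlink split, score it with a reward, and update Q-values from an experience replay memory) can be illustrated with a minimal sketch. This is not the authors' implementation: a lookup table stands in for the deep neural network, and the traffic model, frame length, action set, reward, and every name below are hypothetical choices made only for illustration.

```python
import random
from collections import deque, defaultdict

# All constants are hypothetical; none come from the paper.
SLOTS_PER_FRAME = 10     # slots shared between uplink and downlink
ACTIONS = [2, 4, 6, 8]   # candidate numbers of uplink slots per frame
QUEUE_CAP = 20           # buffer size per direction, in packets


def step(ul_q, dl_q, ul_slots, rng):
    """Toy traffic model: serve one packet per slot, then add random arrivals."""
    dl_slots = SLOTS_PER_FRAME - ul_slots
    ul_served = min(ul_q, ul_slots)
    dl_served = min(dl_q, dl_slots)
    ul_q = ul_q - ul_served + rng.randint(0, 8)
    dl_q = dl_q - dl_served + rng.randint(0, 4)
    # Packets that overflow the buffer count as lost.
    dropped = max(0, ul_q - QUEUE_CAP) + max(0, dl_q - QUEUE_CAP)
    ul_q, dl_q = min(ul_q, QUEUE_CAP), min(dl_q, QUEUE_CAP)
    reward = ul_served + dl_served - dropped
    return ul_q, dl_q, reward


def bucket(ul_q, dl_q):
    """Coarsen queue lengths into a small discrete state."""
    return (ul_q // 5, dl_q // 5)


def train(frames=5000, gamma=0.9, alpha=0.1, eps=0.1, batch=16, seed=0):
    """Tabular Q-learning with experience replay over the toy model."""
    rng = random.Random(seed)
    q = defaultdict(lambda: [0.0] * len(ACTIONS))
    replay = deque(maxlen=1000)  # bounded experience replay memory
    ul_q = dl_q = 0
    for _ in range(frames):
        s = bucket(ul_q, dl_q)
        # Epsilon-greedy choice of the uplink/downlink split.
        if rng.random() < eps:
            a = rng.randrange(len(ACTIONS))
        else:
            a = max(range(len(ACTIONS)), key=lambda i: q[s][i])
        ul_q, dl_q, r = step(ul_q, dl_q, ACTIONS[a], rng)
        s2 = bucket(ul_q, dl_q)
        replay.append((s, a, r, s2))
        # Replay a random minibatch of stored transitions.
        sample = rng.sample(list(replay), min(batch, len(replay)))
        for bs, ba, br, bs2 in sample:
            target = br + gamma * max(q[bs2])
            q[bs][ba] += alpha * (target - q[bs][ba])
    return q
```

Calling `train()` returns the learned table, and `ACTIONS[max(range(len(ACTIONS)), key=lambda i: q[state][i])]` reads off the greedy uplink slot count for a given `state`. Replacing the table with a neural network that maps raw network observations to Q-values would bring the sketch closer to the deep variant the abstract describes.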