[PDF] A Survey of Reinforcement Learning Algorithms and Applications

李茹杨, 彭慧民, 李仁刚, 赵坤 - 计算机系统应用 (Computer Systems & Applications) - csa.org.cn
… for reinforcement learning. We start with a brief review of the principle of reinforcement
learning, including Markov decision process, value function, and exploration vs exploitation. Next, …