B Wang,
J Wu, X Li,
J Shen, Y Zhong - Knowledge-Based Systems, 2022 - Elsevier
In online reinforcement learning, operators predict the return by weighting the successors'
estimated value. However, due to the lack of uncertainty quantification, weights assigned by …