On-policy reinforcement learning via ensemble Gaussian processes with application to resource allocation

KD Polyzos, Q Lu, A Sadeghi, GB Giannakis

2021 55th Asilomar Conference on Signals, Systems, and Computers, 2021 • ieeexplore.ieee.org

Reinforcement learning (RL) is an interactive decision-making tool with well-documented merits for resource allocation tasks in uncertain environments, such as those emerging with the Internet of Things. While it can attain state-of-the-art performance in several application domains, RL using deep neural networks can be less attractive when the training data it requires are prohibitively large. Aiming at sample efficiency, this contribution adopts nonparametric value function models using Gaussian processes (GPs). Relying on the temporal-difference update rule, a novel GP-SARSA approach is developed, where the action selection is guided by Thompson sampling to balance exploration and exploitation. Also targeting computational scalability, the advocated approach leverages random features that replace GP-SARSA's nonparametric function learning with a parametric approximate model. Adaptation to unknown dynamics is accomplished through an ensemble (E) of GP-SARSA learners, whose weights are updated in a data-driven fashion. Performance of the proposed (E)GP-SARSA is evaluated on a practical resource allocation problem.
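The ingredients named in the abstract (a random-feature approximation of a GP value model, a temporal-difference update, Thompson sampling, and data-driven ensemble weights) can be illustrated with a minimal Python sketch. This is not the authors' algorithm: the abstract gives no equations, so everything here is an assumption made for illustration. The class `RFSarsaLearner`, the function `reweight`, the RBF kernel, the treatment of the SARSA target as a plain regression label, and all parameter values are hypothetical choices.

```python
import numpy as np


class RFSarsaLearner:
    """Bayesian linear Q-model on random Fourier features.

    Assumption: the features approximate an RBF-kernel GP; the kernel
    and hyperparameters in the paper may differ.
    """

    def __init__(self, input_dim, num_features=50, lengthscale=1.0,
                 noise_var=0.1, seed=0):
        self.rng = np.random.default_rng(seed)
        # Spectral samples of an RBF kernel plus uniform random phases.
        self.W = self.rng.normal(scale=1.0 / lengthscale,
                                 size=(num_features, input_dim))
        self.b = self.rng.uniform(0.0, 2.0 * np.pi, size=num_features)
        self.noise_var = noise_var
        # Gaussian posterior over the weights, kept in information form
        # (precision matrix and precision-weighted mean); prior is N(0, I).
        self.precision = np.eye(num_features)
        self.shift = np.zeros(num_features)

    def features(self, state, action):
        x = np.concatenate([np.atleast_1d(state),
                            np.atleast_1d(action)]).astype(float)
        return np.sqrt(2.0 / len(self.b)) * np.cos(self.W @ x + self.b)

    def posterior(self):
        cov = np.linalg.inv(self.precision)
        cov = 0.5 * (cov + cov.T)  # symmetrize against round-off
        return cov @ self.shift, cov

    def thompson_action(self, state, actions):
        # Draw one weight vector from the posterior, then act greedily
        # with respect to the sampled Q-function.
        mean, cov = self.posterior()
        theta = self.rng.multivariate_normal(mean, cov)
        q = [self.features(state, a) @ theta for a in actions]
        return actions[int(np.argmax(q))]

    def update(self, state, action, reward, next_state, next_action,
               gamma=0.9):
        # Simplification (assumption): use the on-policy SARSA target
        # r + gamma * Q(s', a') as a label for Bayesian linear regression.
        mean, _ = self.posterior()
        target = reward + gamma * self.features(next_state, next_action) @ mean
        phi = self.features(state, action)
        self.precision += np.outer(phi, phi) / self.noise_var
        self.shift += phi * target / self.noise_var


def reweight(weights, log_likelihoods):
    """Hypothetical ensemble rule: scale each learner's weight by how
    well it predicted the newest target, then renormalize."""
    log_w = np.log(weights) + log_likelihoods
    log_w -= log_w.max()  # stabilize the exponential
    w = np.exp(log_w)
    return w / w.sum()
```

In use, one would create several `RFSarsaLearner` instances with different lengthscales, call `thompson_action` to choose an action, `update` after each transition, and `reweight` to shift ensemble weight toward the learners that predict best.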

Cited by: 8 · Related articles · All 3 versions
