Queueing Network Controls via Deep Reinforcement Learning

JG Dai, M Gluzman - Stochastic Systems, 2022 - pubsonline.informs.org
Novel advanced policy gradient (APG) methods, such as trust region policy optimization and
proximal policy optimization (PPO), have become the dominant reinforcement learning …

Queueing Network Controls via Deep Reinforcement Learning

JG Dai, M Gluzman - arXiv preprint arXiv:2008.01644, 2020 - arxiv.org
Novel advanced policy gradient (APG) methods, such as trust region policy optimization
and proximal policy optimization (PPO), have become the dominant reinforcement learning …

Queueing Network Controls via Deep Reinforcement Learning

JG Dai, M Gluzman - arXiv e-prints, 2020 - ui.adsabs.harvard.edu
Novel advanced policy gradient (APG) methods, such as trust region policy optimization
and proximal policy optimization (PPO), have become the dominant reinforcement learning …

[PDF] Queueing Network Controls via Deep Reinforcement Learning

JG Dai, M Gluzman - ngast.github.io
For more than 30 years, one of the most difficult problems in applied probability and
operations research has been to find a scalable algorithm for approximately solving the optimal …