Q-learning with experience replay in a dynamic environment

Recent progress in reinforcement learning and adaptive dynamic programming for advanced control applications

D Wang, N Gao, D Liu, J Li… - IEEE/CAA Journal of …, 2023 - ieeexplore.ieee.org

Reinforcement learning (RL) has roots in dynamic programming and it is called
adaptive/approximate dynamic programming (ADP) within the control community. This paper …

Cited by 31 Related articles All 3 versions

Stabilizing reinforcement learning in dynamic environment with application to online recommendation

SY Chen, Y Yu, Q Da, J Tan, HK Huang… - Proceedings of the 24th …, 2018 - dl.acm.org

Deep reinforcement learning has shown great potential in improving system performance
autonomously, by learning from interactions with the environment. However, traditional …

Cited by 155 Related articles All 3 versions

Explainable reinforcement learning for broad-xai: a conceptual framework and survey

R Dazeley, P Vamplew, F Cruz - Neural Computing and Applications, 2023 - Springer

Broad-XAI moves away from interpreting individual decisions based on a single datum and
aims to provide integrated explanations from multiple machine learning algorithms into a …

Cited by 49 Related articles All 9 versions

Experience selection in deep reinforcement learning for control

T De Bruin, J Kober, K Tuyls, R Babuška - Journal of Machine Learning …, 2018 - jmlr.org

Experience replay is a technique that allows off-policy reinforcement-learning methods to
reuse past experiences. The stability and speed of convergence of reinforcement learning …

Cited by 80 Related articles All 11 versions

Coexistence scheme for uncoordinated LTE and WiFi networks using experience replay based Q-learning

M Girmay, V Maglogiannis, D Naudts, A Shahid… - Sensors, 2021 - mdpi.com

Nowadays, broadband applications that use the licensed spectrum of the cellular network
are growing fast. For this reason, Long-Term Evolution-Unlicensed (LTE-U) technology is …

Cited by 17 Related articles All 15 versions

Reinforcement learning exploration algorithms for energy harvesting communications systems

A Masadeh, Z Wang, AE Kamal - 2018 IEEE International …, 2018 - ieeexplore.ieee.org

Prolonging the lifetime, and maximizing the throughput are important factors in designing an
efficient communications system, especially for energy harvesting-based systems. In this …

Cited by 35 Related articles All 5 versions

An experience replay method based on tree structure for reinforcement learning

WC Jiang, KS Hwang, JL Lin - IEEE Transactions on Emerging …, 2019 - ieeexplore.ieee.org

In Q-Learning, which is a well-known model-free reinforcement learning algorithm, a learning
agent explores an environment to update a state-action function. In reinforcement learning …

Cited by 18 Related articles All 2 versions

Model learning and knowledge sharing for cooperative multiagent systems in stochastic environment

WC Jiang, V Narayanan, JS Li - IEEE transactions on …, 2020 - ieeexplore.ieee.org

An imposing task for a reinforcement learning agent in an uncertain environment is to
expeditiously learn a policy or a sequence of actions, with which it can achieve the desired …

Cited by 13 Related articles All 8 versions

Batch Learning SDDP for Long-Term Hydrothermal Planning

D Ávila, A Papavasiliou… - IEEE Transactions on …, 2023 - ieeexplore.ieee.org

We consider the stochastic dual dynamic programming (SDDP) algorithm, a widely
employed algorithm applied to multistage stochastic programming, and propose a variant …

Cited by 1 Related articles All 6 versions

[BOOK][B] Methods and Applications of Autonomous Experimentation

M Noack, D Ushizima - 2024 - api.taylorfrancis.com

Just like so many other topics that have been adopted into the realm of machine learning
and AI—deep learning, digital twins, active learning, and so on—Autonomous …

Cited by 1 Related articles All 3 versions
