Least-squares policy iteration

Reinforcement learning algorithms: A brief survey

AK Shakya, G Pillai, S Chakrabarty - Expert Systems with Applications, 2023 - Elsevier

Reinforcement Learning (RL) is a machine learning (ML) technique to learn sequential
decision-making in complex problems. RL is inspired by trial-and-error based human/animal …

Cited by 205 · Related articles · All 2 versions

[PDF] arxiv.org

Offline reinforcement learning: Tutorial, review, and perspectives on open problems

S Levine, A Kumar, G Tucker, J Fu - arXiv preprint arXiv:2005.01643, 2020 - arxiv.org

In this tutorial article, we aim to provide the reader with the conceptual tools needed to get
started on research on offline reinforcement learning algorithms: reinforcement learning …

Cited by 2143 · Related articles · All 3 versions

[PDF] neurips.cc

Bellman-consistent pessimism for offline reinforcement learning

T Xie, CA Cheng, N Jiang, P Mineiro… - Advances in neural …, 2021 - proceedings.neurips.cc

The use of pessimism, when reasoning about datasets lacking exhaustive exploration, has
recently gained prominence in offline reinforcement learning. Despite the robustness it adds …

Cited by 302 · Related articles · All 14 versions

[PDF] neurips.cc

Conservative Q-learning for offline reinforcement learning

A Kumar, A Zhou, G Tucker… - Advances in Neural …, 2020 - proceedings.neurips.cc

Effectively leveraging large, previously collected datasets in reinforcement learning (RL) is
a key challenge for large-scale real-world applications. Offline RL algorithms promise to …

Cited by 2037 · Related articles · All 10 versions

[HTML] sciencedirect.com

Applications of reinforcement learning in energy systems

ATD Perera, P Kamalaruban - Renewable and Sustainable Energy …, 2021 - Elsevier

Energy systems undergo major transitions to facilitate the large-scale penetration of
renewable energy technologies and improve efficiencies, leading to the integration of many …

Cited by 353 · Related articles · All 7 versions

[PDF] arxiv.org

Reinforcement learning in healthcare: A survey

C Yu, J Liu, S Nemati, G Yin - ACM Computing Surveys (CSUR), 2021 - dl.acm.org

As a subfield of machine learning, reinforcement learning (RL) aims at optimizing decision
making by using interaction samples of an agent with its environment and the potentially …

Cited by 780 · Related articles · All 5 versions

[PDF] annualreviews.org

Toward a theoretical foundation of policy optimization for learning control policies

B Hu, K Zhang, N Li, M Mesbahi… - Annual Review of …, 2023 - annualreviews.org

Gradient-based methods have been widely used for system design and optimization in
diverse application domains. Recently, there has been a renewed interest in studying …

Cited by 87 · Related articles · All 6 versions

[PDF] arxiv.org

Partially observable Markov decision processes in robotics: A survey

M Lauri, D Hsu, J Pajarinen - IEEE Transactions on Robotics, 2022 - ieeexplore.ieee.org

Noisy sensing, imperfect control, and environment changes are defining characteristics of
many real-world robot tasks. The partially observable Markov decision process (POMDP) …

Cited by 120 · Related articles · All 7 versions

[PDF] arxiv.org

Reinforcement learning for selective key applications in power systems: Recent advances and future challenges

X Chen, G Qu, Y Tang, S Low… - IEEE Transactions on …, 2022 - ieeexplore.ieee.org

With large-scale integration of renewable generation and distributed energy resources,
modern power systems are confronted with new operational challenges, such as growing …

Cited by 280 · Related articles · All 6 versions

[PDF] mlr.press

A theoretical analysis of deep Q-learning

J Fan, Z Wang, Y Xie, Z Yang - Learning for dynamics and …, 2020 - proceedings.mlr.press

Despite the great empirical success of deep reinforcement learning, its theoretical
foundation is less well understood. In this work, we make the first attempt to theoretically …

Cited by 857 · Related articles · All 9 versions
