Exploration in gradient-based reinforcement learning

SM Kakade - 2003 - search.proquest.com

This thesis is a detailed investigation into the following question: how much data must an
agent collect in order to perform "reinforcement learning" successfully? This question is …

Cited by 862 - Related articles - All 12 versions

[PDF] nowpublishers.com

An invitation to deep reinforcement learning

B Jaeger, A Geiger - Foundations and Trends® in …, 2024 - nowpublishers.com

Training a deep neural network to maximize a target objective has become the standard
recipe for successful machine learning over the last decade. These networks can be …

Cited by 6 - Related articles - All 5 versions

[HTML] europepmc.org

Humanoid robotics and neuroscience: Science, engineering and society

G Cheng - 2014 - books.google.com

With contributions from prominent scientists, this volume presents a scientific understanding
of humans with a view towards developing better-engineered systems and machines for …

Cited by 29 - Related articles - All 10 versions

[PDF] nus.edu.sg

Importance sampling for online planning under uncertainty

Y Luo, H Bai, D Hsu, WS Lee - The International Journal of …, 2019 - journals.sagepub.com

The partially observable Markov decision process (POMDP) provides a principled general
framework for robot planning under uncertainty. Leveraging the idea of Monte Carlo …

Cited by 69 - Related articles - All 8 versions

[PDF] psu.edu

[BOOK] Planning under uncertainty in complex structured environments

CE Guestrin - 2003 - search.proquest.com

Many real-world tasks require multiple decision makers (agents) to coordinate their actions
in order to achieve common long-term goals. Examples include: manufacturing systems …

Cited by 131 - Related articles - All 14 versions

[PDF] mit.edu

Importance sampling for reinforcement learning with multiple objectives

CR Shelton - 2001 - dspace.mit.edu

This thesis considers three complications that arise from applying reinforcement learning to
a real-world application. In the process of using reinforcement learning to build an adaptive …

Cited by 106 - Related articles - All 14 versions

[PDF] psu.edu

[PDF] Policy-gradient algorithms for partially observable Markov decision processes

D Aberdeen - 2003 - Citeseer

Partially observable Markov decision processes are interesting because of their ability to
model most conceivable real-world learning problems, for example, robot navigation, driving …

Cited by 126 - Related articles - All 10 versions

[PDF] aps.org

Deep learning-enhanced variational Monte Carlo method for quantum many-body physics

L Yang, Z Leng, G Yu, A Patel, WJ Hu, H Pu - Physical Review Research, 2020 - APS

Artificial neural networks have been successfully incorporated into the variational Monte
Carlo method (VMC) to study quantum many-body systems. However, there have been few …

Cited by 40 - Related articles - All 7 versions

[PDF] arxiv.org

Learning from scarce experience

L Peshkin, CR Shelton - arXiv preprint cs/0204043, 2002 - arxiv.org

Searching the space of policies directly for the optimal policy has been one popular method
for solving partially observable reinforcement learning problems. Typically, with each …

Cited by 97 - Related articles - All 14 versions

Two-step gradient-based reinforcement learning for underwater robotics behavior learning

A El-Fakdi, M Carreras - Robotics and Autonomous Systems, 2013 - Elsevier

This article proposes a field application of a Reinforcement Learning (RL) control system for
solving the action selection problem of an autonomous robot in a cable tracking task. The …

Cited by 67 - Related articles - All 4 versions
