Data-efficient policy evaluation through behavior policy search

S Narvekar, B Peng, M Leonetti, J Sinapov… - Journal of Machine …, 2020 - jmlr.org

Reinforcement learning (RL) is a popular paradigm for addressing sequential decision tasks
in which the agent has only limited environmental feedback. Despite many advances over …

Cited by 479 · Related articles · All 11 versions

[PDF] mlr.press

Tunenet: One-shot residual tuning for system identification and sim-to-real robot task transfer

A Allevato, ES Short, M Pryor… - Conference on Robot …, 2020 - proceedings.mlr.press

As researchers teach robots to perform more and more complex tasks, the need for realistic
simulation environments is growing. Existing techniques for closing the reality gap by …

Cited by 63 · Related articles · All 6 versions

[PDF] google.com

Iterative residual tuning for system identification and sim-to-real robot learning

AD Allevato, E Schaertl Short, M Pryor, AL Thomaz - Autonomous Robots, 2020 - Springer

Robots are increasingly learning complex skills in simulation, increasing the need for
realistic simulation environments. Existing techniques for approximating real-world physics …

Cited by 11 · Related articles · All 4 versions

[PDF] arxiv.org

Causality and batch reinforcement learning: Complementary approaches to planning in unknown domains

J Bannon, B Windsor, W Song, T Li - arXiv preprint arXiv:2006.02579, 2020 - arxiv.org

Reinforcement learning algorithms have had tremendous successes in online learning
settings. However, these successes have relied on low-stakes interactions between the …

Cited by 11 · Related articles · All 4 versions

[PDF] polimi.it

Optimal policy evaluation for policy optimization

S Meta - 2020 - politesi.polimi.it

Off-policy methods are the basis of a large number of effective Policy Optimization
algorithms. In this setting, Importance Sampling is typically employed as a what-if analysis …

[PDF] arxiv.org

Reinforcement Learning Architectures: SAC, TAC, and ESAC

A Masadeh, Z Wang, AE Kamal - arXiv preprint arXiv:2004.02274, 2020 - arxiv.org

The trend is to implement intelligent agents capable of analyzing available information and
utilize it efficiently. This work presents a number of reinforcement learning (RL) architectures; …
