Between MDPs and semi-MDPs: A framework for temporal abstraction in reinforcement learning

RS Sutton, D Precup, S Singh - Artificial intelligence, 1999 - Elsevier

Learning, planning, and representing knowledge at multiple levels of temporal abstraction
are key, longstanding challenges for AI. In this paper we consider how these challenges can …

Cited by 4493 - Related articles - All 39 versions

Option discovery using deep skill chaining

A Bagaria, G Konidaris - International Conference on Learning …, 2019 - openreview.net

Autonomously discovering temporally extended actions, or skills, is a longstanding goal of
hierarchical reinforcement learning. We propose a new algorithm that combines skill …

Cited by 125 - Related articles - All 10 versions

Two-dimensional antijamming mobile communication based on reinforcement learning

L Xiao, D Jiang, D Xu, H Zhu, Y Zhang… - IEEE Transactions on …, 2018 - ieeexplore.ieee.org

By using smart radio devices, a jammer can dynamically change its jamming policy based
on opposing security mechanisms; it can even induce the mobile device to enter a specific …

Cited by 122 - Related articles - All 7 versions

A review of symbolic, subsymbolic and hybrid methods for sequential decision making

C Núñez-Molina, P Mesejo… - ACM Computing …, 2024 - dl.acm.org

In the field of Sequential Decision Making (SDM), two paradigms have historically vied for
supremacy: Automated Planning (AP) and Reinforcement Learning (RL). In the spirit of …

Cited by 2 - Related articles - All 4 versions

Learning options via compression

Y Jiang, E Liu, B Eysenbach… - Advances in Neural …, 2022 - proceedings.neurips.cc

Identifying statistical regularities in solutions to some tasks in multi-task reinforcement
learning can accelerate the learning of new tasks. Skill learning offers one way of identifying …

Cited by 12 - Related articles - All 7 versions

Fast motion planning from experience: trajectory prediction for speeding up movement generation

N Jetchev, M Toussaint - Autonomous Robots, 2013 - Springer

Trajectory planning and optimization is a fundamental problem in articulated robotics.
Algorithms used typically for this problem compute optimal trajectories from scratch in a new …

Cited by 95 - Related articles - All 8 versions

Between MDPs and Semi-MDPs: Learning, planning, and representing knowledge at multiple temporal scales

RS Sutton - 1998 - scholarworks.umass.edu

Learning, planning, and representing knowledge at multiple levels of temporal abstraction
are key challenges for AI. In this paper we develop an approach to these problems based on …

Cited by 124 - Related articles - All 22 versions

Assistive teaching of motor control tasks to humans

M Srivastava, E Biyik, S Mirchandani… - Advances in …, 2022 - proceedings.neurips.cc

Recent works on shared autonomy and assistive-AI technologies, such as assistive robotic
teleoperation, seek to model and help human users with limited ability in a fixed task …

Cited by 5 - Related articles - All 9 versions

Near-optimal optimistic reinforcement learning using empirical Bernstein inequalities

A Tossou, D Basu, C Dimitrakakis - arXiv preprint arXiv:1905.12425, 2019 - arxiv.org

We study model-based reinforcement learning in an unknown finite communicating Markov
decision process. We propose a simple algorithm that leverages a variance based …

Cited by 33 - Related articles - All 5 versions

Solving the physical traveling salesman problem: Tree search and macro actions

D Perez, EJ Powley, D Whitehouse… - … Intelligence and AI …, 2013 - ieeexplore.ieee.org

This paper presents a number of approaches for solving a real-time game consisting of a
ship that must visit a number of waypoints scattered around a 2-D maze full of obstacles. The …

Cited by 68 - Related articles - All 11 versions
