Offer: Off-environment reinforcement learning

B Jang, M Kim, G Harerimana, JW Kim - IEEE access, 2019 - ieeexplore.ieee.org

Q-learning is arguably one of the most applied representative reinforcement learning
approaches and one of the off-policy strategies. Since the emergence of Q-learning, many …

Cited by 585 · Related articles · All 6 versions

[HTML] sciencedirect.com

[HTML] Reinforcement learning for swarm robotics: An overview of applications, algorithms and simulators

MA Blais, MA Akhloufi - Cognitive Robotics, 2023 - Elsevier

Robots such as drones, ground rovers, underwater vehicles and industrial robots have
increased in popularity in recent years. Many sectors have benefited from this by increasing …

Cited by 30 · Related articles · All 2 versions

[PDF] arxiv.org

A survey and critique of multiagent deep reinforcement learning

P Hernandez-Leal, B Kartal, ME Taylor - Autonomous Agents and Multi …, 2019 - Springer

Deep reinforcement learning (RL) has achieved outstanding results in recent years. This has
led to a dramatic increase in the number of applications and methods. Recent works have …

Cited by 687 · Related articles · All 8 versions

[PDF] arxiv.org

Adversarial evaluation of autonomous vehicles in lane-change scenarios

B Chen, X Chen, Q Wu, L Li - IEEE transactions on intelligent …, 2021 - ieeexplore.ieee.org

Autonomous vehicles must be comprehensively evaluated before deployed in cities and
highways. However, most existing evaluation approaches for autonomous vehicles are static …

Cited by 99 · Related articles · All 8 versions

[PDF] neurips.cc

Subgaussian and differentiable importance sampling for off-policy evaluation and learning

AM Metelli, A Russo, M Restelli - Advances in neural …, 2021 - proceedings.neurips.cc

Importance Sampling (IS) is a widely used building block for a large variety of off-policy
estimation and learning algorithms. However, empirical and theoretical studies have …

Cited by 37 · Related articles · All 12 versions

[PDF] researchgate.net

[PDF] Is multiagent deep reinforcement learning the answer or the question? A brief survey

P Hernandez-Leal, B Kartal, ME Taylor - learning, 2018 - researchgate.net

Deep reinforcement learning (RL) has achieved outstanding results in recent years. This has
led to a dramatic increase in the number of applications and methods. Recent works have …

Cited by 106 · Related articles · All 2 versions

[PDF] jmlr.org

Experience selection in deep reinforcement learning for control

T De Bruin, J Kober, K Tuyls, R Babuška - Journal of Machine Learning …, 2018 - jmlr.org

Experience replay is a technique that allows off-policy reinforcement-learning methods to
reuse past experiences. The stability and speed of convergence of reinforcement learning …

Cited by 83 · Related articles · All 11 versions

[PDF] arxiv.org

AI research considerations for human existential safety (ARCHES)

A Critch, D Krueger - arXiv preprint arXiv:2006.04948, 2020 - arxiv.org

Framed in positive terms, this report examines how technical AI research might be steered in
a manner that is more attentive to humanity's long-term prospects for survival as a species …

Cited by 59 · Related articles · All 3 versions

[PDF] springer.com

Importance sampling in reinforcement learning with an estimated behavior policy

JP Hanna, S Niekum, P Stone - Machine Learning, 2021 - Springer

In reinforcement learning, importance sampling is a widely used method for evaluating an
expectation under the distribution of data of one policy when the data has in fact been …

Cited by 37 · Related articles · All 13 versions

[PDF] aaai.org

Expected policy gradients

K Ciosek, S Whiteson - Proceedings of the AAAI Conference on …, 2018 - ojs.aaai.org

We propose expected policy gradients (EPG), which unify stochastic policy gradients (SPG)
and deterministic policy gradients (DPG) for reinforcement learning. Inspired by expected …

Cited by 88 · Related articles · All 11 versions
