Differentiable trust region layers for deep reinforcement learning

A two-level energy management strategy for multi-microgrid systems with interval prediction and reinforcement learning

L Xiong, Y Tang, S Mao, H Liu, K Meng… - … on Circuits and …, 2022 - ieeexplore.ieee.org

Setting retail electricity prices is one of the significant strategies for energy management of
multi-microgrid (MMG) systems integrated with renewable energy. Nevertheless, the need of …

Cited by 56 · Related articles · All 2 versions

[PDF] mlr.press

Deep black-box reinforcement learning with movement primitives

F Otto, O Celik, H Zhou, H Ziesche… - … on Robot Learning, 2023 - proceedings.mlr.press

Episode-based reinforcement learning (ERL) algorithms treat reinforcement learning (RL) as
a black-box optimization problem where we learn to select a parameter vector of a controller …

Cited by 24 · Related articles · All 5 versions

[PDF] arxiv.org

Open the Black Box: Step-based Policy Updates for Temporally-Correlated Episodic Reinforcement Learning

G Li, H Zhou, D Roth, S Thilges, F Otto… - arXiv preprint arXiv …, 2024 - arxiv.org

Current advancements in reinforcement learning (RL) have predominantly focused on
learning step-based policies that generate actions for each perceived state. While these …

Cited by 6 · Related articles · All 3 versions

[PDF] arxiv.org

Bridging the gap between learning-to-plan, motion primitives and safe reinforcement learning

P Kicki, D Tateo, P Liu, J Guenster, J Peters… - arXiv preprint arXiv …, 2024 - arxiv.org

Trajectory planning under kinodynamic constraints is fundamental for advanced robotics
applications that require dexterous, reactive, and rapid skills in complex environments …

Cited by 2 · Related articles · All 4 versions

[PDF] jmlr.org

Robust Black-Box Optimization for Stochastic Search and Episodic Reinforcement Learning

M Hüttenrauch, G Neumann - Journal of Machine Learning Research, 2024 - jmlr.org

Black-box optimization is a versatile approach to solve complex problems where the
objective function is not explicitly known and no higher order information is available. Due to …

Cited by 1 · Related articles

[PDF] arxiv.org

MP3: Movement primitive-based (re-)planning policy

F Otto, H Zhou, O Celik, G Li, R Lioutikov… - arXiv preprint arXiv …, 2023 - arxiv.org

We introduce a novel deep reinforcement learning (RL) approach called Movement
Primitive-based Planning Policy (MP3). By integrating movement primitives (MPs) into the deep RL …

Cited by 6 · Related articles · All 2 versions

[PDF] arxiv.org

Acquiring Diverse Skills using Curriculum Reinforcement Learning with Mixture of Experts

O Celik, A Taranovic, G Neumann - arXiv preprint arXiv:2403.06966, 2024 - arxiv.org

Reinforcement learning (RL) is a powerful approach for acquiring a good-performing policy.
However, learning diverse skills is challenging in RL due to the commonly used Gaussian …

Cited by 4 · Related articles · All 3 versions

[PDF] arxiv.org

A unified perspective on natural gradient variational inference with Gaussian mixture models

O Arenz, P Dahlinger, Z Ye, M Volpp… - arXiv preprint arXiv …, 2022 - arxiv.org

Variational inference with Gaussian mixture models (GMMs) enables learning of highly
tractable yet multi-modal approximations of intractable target distributions with up to a few …

Cited by 7 · Related articles · All 5 versions

[PDF] mlr.press

An analytical update rule for general policy optimization

H Li, N Clavette, H He - International Conference on …, 2022 - proceedings.mlr.press

We present an analytical policy update rule that is independent of parametric function
approximators. The policy update rule is suitable for optimizing general stochastic policies …

Cited by 7 · Related articles · All 5 versions

[PDF] neurips.cc

Wasserstein gradient flows for optimizing Gaussian mixture policies

H Ziesche, L Rozo - Advances in Neural Information …, 2024 - proceedings.neurips.cc

Robots often rely on a repertoire of previously-learned motion policies for performing tasks
of diverse complexities. When facing unseen task conditions or when new task requirements …

Cited by 4 · Related articles · All 6 versions
