Mastering diverse domains through world models

D Hafner, J Pasukonis, J Ba, T Lillicrap - arXiv preprint arXiv:2301.04104, 2023 - arxiv.org
Developing a general algorithm that learns to solve tasks across a wide range of
applications has been a fundamental challenge in artificial intelligence. Although current …

Convergence of policy gradient methods for finite-horizon exploratory linear-quadratic control problems

M Giegrich, C Reisinger, Y Zhang - SIAM Journal on Control and Optimization, 2024 - SIAM
We study the global linear convergence of policy gradient (PG) methods for finite-horizon
continuous-time exploratory linear-quadratic control (LQC) problems. The setting includes …

Examining Responsibility and Deliberation in AI Impact Statements and Ethics Reviews

D Liu, P Nanayakkara, SA Sakha… - Proceedings of the …, 2022 - dl.acm.org
The artificial intelligence research community is continuing to grapple with the ethics of its
work by encouraging researchers to discuss potential positive and negative consequences …

Reinforcement Learning for Jump-Diffusions

X Gao, L Li, XY Zhou - arXiv preprint arXiv:2405.16449, 2024 - arxiv.org
We study continuous-time reinforcement learning (RL) for stochastic control in which system
dynamics are governed by jump-diffusion processes. We formulate an entropy-regularized …

Reinforcement Learning with Elastic Time Steps

D Wang, G Beltrame - arXiv preprint arXiv:2402.14961, 2024 - arxiv.org
Traditional Reinforcement Learning (RL) algorithms are usually applied in robotics to learn
controllers that act with a fixed control rate. Given the discrete nature of RL algorithms, they …

Learning Uncertainty-Aware Temporally-Extended Actions

J Lee, SJ Park, Y Tang, M Oh - Proceedings of the AAAI Conference on …, 2024 - ojs.aaai.org
In reinforcement learning, temporal abstraction in the action space is a common approach to
simplifying the learning process of policies through temporally-extended courses of action …

Managing temporal resolution in continuous value estimation: A fundamental trade-off

ZV Zhang, J Kirschner, J Zhang… - Advances in …, 2024 - proceedings.neurips.cc
A default assumption in reinforcement learning (RL) and optimal control is that observations
arrive at discrete time points on a fixed clock cycle. Yet, many applications involve …

Dynamic Decision Frequency with Continuous Options

A Karimi, J Jin, J Luo, AR Mahmood… - 2023 IEEE/RSJ …, 2023 - ieeexplore.ieee.org
In classic reinforcement learning algorithms, agents make decisions at discrete and fixed
time intervals. The duration between decisions becomes a crucial hyperparameter, as …

Sublinear regret for an actor-critic algorithm in continuous-time linear-quadratic reinforcement learning

Y Huang, Y Jia, XY Zhou - Available at SSRN 4904358, 2024 - papers.ssrn.com
We study reinforcement learning (RL) for a class of continuous-time linear-quadratic (LQ)
control problems for diffusions where volatility of the state processes depends on both state …

Simultaneously updating all persistence values in reinforcement learning

L Sabbioni, L Al Daire, L Bisi, AM Metelli… - Proceedings of the AAAI …, 2023 - ojs.aaai.org
In Reinforcement Learning, the performance of learning agents is highly sensitive to
the choice of time discretization. Agents acting at high frequencies have the best control …