Related articles - Academic Resource Search

Boosting Long-Delayed Reinforcement Learning with Auxiliary Short-Delayed Task

Q Wu, SS Zhan, Y Wang, CW Lin, C Lv, Q Zhu… - arXiv preprint arXiv …, 2024 - arxiv.org

Reinforcement learning is challenging in delayed scenarios, a common real-world situation
where observations and interactions occur with delays. State-of-the-art (SOTA) state …

Cited by 1 · Related articles · All 2 versions

[PDF] openreview.net

Boosting Reinforcement Learning with Strongly Delayed Feedback Through Auxiliary Short Delays

Q Wu, SS Zhan, Y Wang, Y Wang, CW Lin, C Lv… - Forty-first International … - openreview.net

Reinforcement learning (RL) is challenging in the common case of delays between events
and their sensory perceptions. State-of-the-art (SOTA) state augmentation techniques either …

[PDF] arxiv.org

Variational Delayed Policy Optimization

Q Wu, SS Zhan, Y Wang, Y Wang, CW Lin, C Lv… - arXiv preprint arXiv …, 2024 - arxiv.org

In environments with delayed observation, state augmentation by including actions within
the delay window is adopted to retrieve Markovian property to enable reinforcement learning …

DEER: A Delay-Resilient Framework for Reinforcement Learning with Variable Delays

B Xia, Y Kong, B Yuan, Y Chang, Z Li, X Wang - 2023 - openreview.net

Classic reinforcement learning (RL) frequently struggles with tasks involving delays due to
the violation of the Markov property. Existing approaches usually tackle this issue with end …

[PDF] arxiv.org

Open the Black Box: Step-based Policy Updates for Temporally-Correlated Episodic Reinforcement Learning

G Li, H Zhou, D Roth, S Thilges, F Otto… - arXiv preprint arXiv …, 2024 - arxiv.org

Current advancements in reinforcement learning (RL) have predominantly focused on
learning step-based policies that generate actions for each perceived state. While these …

Cited by 2 · Related articles · All 3 versions

[PDF] arxiv.org

Revisiting state augmentation methods for reinforcement learning with stochastic delays

S Nath, M Baranwal, H Khadilkar - Proceedings of the 30th ACM …, 2021 - dl.acm.org

Several real-world scenarios, such as remote control and sensing, are comprised of action
and observation delays. The presence of delays degrades the performance of reinforcement …

Cited by 21 · Related articles · All 6 versions

Addressing Delays in Reinforcement Learning via Delayed Adversarial Imitation Learning

M Xie, B Xia, Y Yu, X Wang, Y Chang - International Conference on …, 2023 - Springer

Observation and action delays occur commonly in many real-world tasks which violate
Markov property and consequently degrade the performance of Reinforcement Learning …

Statistically Efficient Variance Reduction with Double Policy Estimation for Off-Policy Evaluation in Sequence-Modeled Reinforcement Learning

H Zhou, T Lan, V Aggarwal - arXiv preprint arXiv:2308.14897, 2023 - arxiv.org

Offline reinforcement learning aims to utilize datasets of previously gathered environment-
action interaction records to learn a policy without access to the real environment. Recent …

Cited by 3 · Related articles · All 2 versions

[PDF] arxiv.org

Time-aware Q-networks: Resolving temporal irregularity for deep reinforcement learning

YJ Kim, M Chi - arXiv preprint arXiv:2105.02580, 2021 - arxiv.org

Deep Reinforcement Learning (DRL) has shown outstanding performance on inducing
effective action policies that maximize expected long-term return on many complex tasks …

Cited by 1 · Related articles · All 2 versions

[PDF] ncsu.edu

Time-aware deep reinforcement learning with multi-temporal abstraction

YJ Kim, M Chi - Applied Intelligence, 2023 - Springer

Deep reinforcement learning (DRL) is advantageous, but it rarely performs well when tested
on real-world decision-making tasks, particularly those involving irregular time series with …
