CY Wei, C Dann, J Zimmert - International Conference on …, 2022 - proceedings.mlr.press
We develop a model selection approach to tackle reinforcement learning with adversarial corruption in both transition and reward. For finite-horizon tabular MDPs, without prior …
T Nguyen, TM Luu, T Ton, CD Yoo - arXiv preprint arXiv:2405.11206, 2024 - arxiv.org
Offline reinforcement learning (RL) addresses the challenge of expensive and high-risk data exploration inherent in RL by pre-training policies on vast amounts of offline data, enabling …
We study the problem of robust reinforcement learning under adversarial corruption on both rewards and transitions. Our attack model assumes an adaptive adversary who can …
In light of the burgeoning success of reinforcement learning (RL) in diverse real-world applications, considerable focus has been directed towards ensuring RL policies are robust …
As reinforcement learning (RL) has achieved great success and been even adopted in safety-critical domains such as autonomous vehicles, a range of empirical studies have …
Y Chen, S Du, K Jamieson - International Conference on …, 2021 - proceedings.mlr.press
We study episodic reinforcement learning under unknown adversarial corruptions in both the rewards and the transition probabilities of the underlying system. We propose new …
N Jiang - arXiv preprint arXiv:2404.09946, 2024 - arxiv.org
This note clarifies some confusions (and perhaps throws out more) around model-based reinforcement learning and their theoretical understanding in the context of deep RL. Main …
We study data corruption robustness for reinforcement learning with human feedback (RLHF) in an offline setting. Given an offline dataset of pairs of trajectories along with …
In real-world tasks, reinforcement learning (RL) agents frequently encounter situations that are not present during training time. To ensure reliable performance, the RL agents need to …