Towards robust model-based reinforcement learning against adversarial corruption

C Ye, J He, Q Gu, T Zhang - arXiv preprint arXiv:2402.08991, 2024 - arxiv.org
This study tackles the challenges of adversarial corruption in model-based reinforcement
learning (RL), where the transition dynamics can be corrupted by an adversary. Existing …

A model selection approach for corruption robust reinforcement learning

CY Wei, C Dann, J Zimmert - International Conference on …, 2022 - proceedings.mlr.press
We develop a model selection approach to tackle reinforcement learning with adversarial
corruption in both transition and reward. For finite-horizon tabular MDPs, without prior …

Towards Robust Policy: Enhancing Offline Reinforcement Learning with Adversarial Attacks and Defenses

T Nguyen, TM Luu, T Ton, CD Yoo - arXiv preprint arXiv:2405.11206, 2024 - arxiv.org
Offline reinforcement learning (RL) addresses the challenge of expensive and high-risk data
exploration inherent in RL by pre-training policies on vast amounts of offline data, enabling …

Robust policy gradient against strong data corruption

X Zhang, Y Chen, X Zhu, W Sun - … Conference on Machine …, 2021 - proceedings.mlr.press
We study the problem of robust reinforcement learning under adversarial corruption on both
rewards and transitions. Our attack model assumes an adaptive adversary who can …

Beyond Worst-case Attacks: Robust RL with Adaptive Defense via Non-dominated Policies

X Liu, C Deng, Y Sun, Y Liang, F Huang - arXiv preprint arXiv:2402.12673, 2024 - arxiv.org
In light of the burgeoning success of reinforcement learning (RL) in diverse real-world
applications, considerable focus has been directed towards ensuring RL policies are robust …

CROP: Certifying robust policies for reinforcement learning through functional smoothing

F Wu, L Li, Z Huang, Y Vorobeychik, D Zhao… - arXiv preprint arXiv …, 2021 - arxiv.org
As reinforcement learning (RL) has achieved great success and been even adopted in
safety-critical domains such as autonomous vehicles, a range of empirical studies have …

Improved corruption robust algorithms for episodic reinforcement learning

Y Chen, S Du, K Jamieson - International Conference on …, 2021 - proceedings.mlr.press
We study episodic reinforcement learning under unknown adversarial corruptions in both
the rewards and the transition probabilities of the underlying system. We propose new …

A Note on Loss Functions and Error Compounding in Model-based Reinforcement Learning

N Jiang - arXiv preprint arXiv:2404.09946, 2024 - arxiv.org
This note clarifies some confusions (and perhaps throws out more) around model-based
reinforcement learning and their theoretical understanding in the context of deep RL. Main …

Corruption Robust Offline Reinforcement Learning with Human Feedback

D Mandal, A Nika, P Kamalaruban, A Singla… - arXiv preprint arXiv …, 2024 - arxiv.org
We study data corruption robustness for reinforcement learning with human feedback
(RLHF) in an offline setting. Given an offline dataset of pairs of trajectories along with …

Combining pessimism with optimism for robust and efficient model-based deep reinforcement learning

S Curi, I Bogunovic, A Krause - International Conference on …, 2021 - proceedings.mlr.press
In real-world tasks, reinforcement learning (RL) agents frequently encounter situations that
are not present during training time. To ensure reliable performance, the RL agents need to …