CY Wei, C Dann, J Zimmert - International Conference on …, 2022 - proceedings.mlr.press
We develop a model selection approach to tackle reinforcement learning with adversarial corruption in both transition and reward. For finite-horizon tabular MDPs, without prior …
T Nguyen, TM Luu, T Ton, CD Yoo - arXiv preprint arXiv:2405.11206, 2024 - arxiv.org
Offline reinforcement learning (RL) addresses the challenge of expensive and high-risk data exploration inherent in RL by pre-training policies on vast amounts of offline data, enabling …
We study the problem of robust reinforcement learning under adversarial corruption on both rewards and transitions. Our attack model assumes an adaptive adversary who can …
In light of the burgeoning success of reinforcement learning (RL) in diverse real-world applications, considerable focus has been directed towards ensuring RL policies are robust …
As reinforcement learning (RL) has achieved great success and been even adopted in safety-critical domains such as autonomous vehicles, a range of empirical studies have …
Y Chen, S Du, K Jamieson - International Conference on …, 2021 - proceedings.mlr.press
We study episodic reinforcement learning under unknown adversarial corruptions in both the rewards and the transition probabilities of the underlying system. We propose new …
N Jiang - arXiv preprint arXiv:2404.09946, 2024 - arxiv.org
This note clarifies some confusions (and perhaps throws out more) around model-based reinforcement learning and their theoretical understanding in the context of deep RL. Main …
We study data corruption robustness for reinforcement learning with human feedback (RLHF) in an offline setting. Given an offline dataset of pairs of trajectories along with …
In real-world tasks, reinforcement learning (RL) agents frequently encounter situations that are not present during training time. To ensure reliable performance, the RL agents need to …