Policy gradient algorithms for robust MDPs with non-rectangular uncertainty sets

M Li, D Kuhn, T Sutter - arXiv preprint arXiv:2305.19004, 2023 - arxiv.org
We propose policy gradient algorithms for robust infinite-horizon Markov decision processes
(MDPs) with non-rectangular uncertainty sets, thereby addressing an open challenge in the …

Sample-Efficient Robust Multi-Agent Reinforcement Learning in the Face of Environmental Uncertainty

L Shi, E Mazumdar, Y Chi, A Wierman - arXiv preprint arXiv:2404.18909, 2024 - arxiv.org
To overcome the sim-to-real gap in reinforcement learning (RL), learned policies must
maintain robustness against environmental uncertainties. While robust RL has been widely …

Sample complexity of offline distributionally robust linear Markov decision processes

H Wang, L Shi, Y Chi - arXiv preprint arXiv:2403.12946, 2024 - arxiv.org
In offline reinforcement learning (RL), the absence of active exploration calls for attention on
the model robustness to tackle the sim-to-real gap, where the discrepancy between the …

Distributionally Robust Constrained Reinforcement Learning under Strong Duality

Z Zhang, K Panaganti, L Shi, Y Sui, A Wierman… - arXiv preprint arXiv …, 2024 - arxiv.org
We study the problem of Distributionally Robust Constrained RL (DRC-RL), where the goal
is to maximize the expected reward subject to environmental distribution shifts and …

Distributionally Robust Reinforcement Learning with Interactive Data Collection: Fundamental Hardness and Near-Optimal Algorithm

M Lu, H Zhong, T Zhang, J Blanchet - arXiv preprint arXiv:2404.03578, 2024 - arxiv.org
The sim-to-real gap, which represents the disparity between training and testing
environments, poses a significant challenge in reinforcement learning (RL). A promising …

A Single-Loop Robust Policy Gradient Method for Robust Markov Decision Processes

Z Lin, C Xue, Q Deng, Y Ye - arXiv preprint arXiv:2406.00274, 2024 - arxiv.org
Robust Markov Decision Processes (RMDPs) have recently been recognized as a valuable
and promising approach to discovering a policy with creditable performance, particularly in …

Accelerated Policy Gradient for s-rectangular Robust MDPs with Large State Spaces

Z Chen, H Huang - Forty-first International Conference on Machine … - openreview.net
Robust Markov decision process (robust MDP) is an important machine learning framework
to make a reliable policy that is robust to environmental perturbation. Despite empirical …