The curious price of distributional robustness in reinforcement learning with a generative model

L Shi, G Li, Y Wei, Y Chen… - Advances in Neural …, 2024 - proceedings.neurips.cc
This paper investigates model robustness in reinforcement learning (RL) via the framework
of distributionally robust Markov decision processes (RMDPs). Despite recent efforts, the …

Seeing is not believing: Robust reinforcement learning against spurious correlation

W Ding, L Shi, Y Chi, D Zhao - Advances in Neural …, 2024 - proceedings.neurips.cc
Robustness has been extensively studied in reinforcement learning (RL) to handle various
forms of uncertainty such as random perturbations, rare events, and malicious attacks. In this …

Sample-Efficient Robust Multi-Agent Reinforcement Learning in the Face of Environmental Uncertainty

L Shi, E Mazumdar, Y Chi, A Wierman - arXiv preprint arXiv:2404.18909, 2024 - arxiv.org
To overcome the sim-to-real gap in reinforcement learning (RL), learned policies must
maintain robustness against environmental uncertainties. While robust RL has been widely …

Sample complexity of offline distributionally robust linear Markov decision processes

H Wang, L Shi, Y Chi - arXiv preprint arXiv:2403.12946, 2024 - arxiv.org
In offline reinforcement learning (RL), the absence of active exploration calls for attention on
the model robustness to tackle the sim-to-real gap, where the discrepancy between the …

Distributionally Robust Constrained Reinforcement Learning under Strong Duality

Z Zhang, K Panaganti, L Shi, Y Sui, A Wierman… - arXiv preprint arXiv …, 2024 - arxiv.org
We study the problem of Distributionally Robust Constrained RL (DRC-RL), where the goal
is to maximize the expected reward subject to environmental distribution shifts and …

Sample Efficient Reinforcement Learning from Human Feedback via Active Exploration

V Mehta, V Das, O Neopane, Y Dai, I Bogunovic… - 2023 - openreview.net
Preference-based feedback is important for many applications in reinforcement learning
where direct evaluation of a reward function is not feasible. A notable recent example arises …

Tractable Equilibrium Computation in Markov Games through Risk Aversion

E Mazumdar, K Panaganti, L Shi - arXiv preprint arXiv:2406.14156, 2024 - arxiv.org
A significant roadblock to the development of principled multi-agent reinforcement learning
is the fact that desired solution concepts like Nash equilibria may be intractable to compute …

Sensor-Based Distributionally Robust Control for Safe Robot Navigation in Dynamic Environments

K Long, Y Yi, Z Dai, S Herbert, J Cortés… - arXiv preprint arXiv …, 2024 - arxiv.org
We introduce a novel method for safe mobile robot navigation in dynamic, unknown
environments, utilizing onboard sensing to impose safety constraints without the need for …

Wasserstein Distributionally Robust Control and State Estimation for Partially Observable Linear Systems

M Jang, A Hakobyan, I Yang - arXiv preprint arXiv:2406.01723, 2024 - arxiv.org
This paper presents a novel Wasserstein distributionally robust control and state estimation
algorithm for partially observable linear stochastic systems, where the probability …

Sample-Efficient Reinforcement Learning with Applications in Nuclear Fusion

V Mehta - kilthub.cmu.edu
In many practical applications of reinforcement learning (RL), it is expensive to observe state
transitions from the environment. In the problem of plasma control for nuclear fusion, the …