Distributionally robust model-based reinforcement learning with large state spaces

L Shi, G Li, Y Wei, Y Chen… - Advances in Neural …, 2024 - proceedings.neurips.cc

This paper investigates model robustness in reinforcement learning (RL) via the framework
of distributionally robust Markov decision processes (RMDPs). Despite recent efforts, the …

Cited by 38 · Related articles · All 10 versions

Seeing is not believing: Robust reinforcement learning against spurious correlation

W Ding, L Shi, Y Chi, D Zhao - Advances in Neural …, 2024 - proceedings.neurips.cc

Robustness has been extensively studied in reinforcement learning (RL) to handle various
forms of uncertainty such as random perturbations, rare events, and malicious attacks. In this …

Cited by 15 · Related articles · All 7 versions

Sample-Efficient Robust Multi-Agent Reinforcement Learning in the Face of Environmental Uncertainty

L Shi, E Mazumdar, Y Chi, A Wierman - arXiv preprint arXiv:2404.18909, 2024 - arxiv.org

To overcome the sim-to-real gap in reinforcement learning (RL), learned policies must
maintain robustness against environmental uncertainties. While robust RL has been widely …

Cited by 9 · Related articles · All 5 versions

Sample complexity of offline distributionally robust linear Markov decision processes

H Wang, L Shi, Y Chi - arXiv preprint arXiv:2403.12946, 2024 - arxiv.org

In offline reinforcement learning (RL), the absence of active exploration calls for attention on
the model robustness to tackle the sim-to-real gap, where the discrepancy between the …

Cited by 9 · Related articles · All 4 versions

Distributionally Robust Constrained Reinforcement Learning under Strong Duality

Z Zhang, K Panaganti, L Shi, Y Sui, A Wierman… - arXiv preprint arXiv …, 2024 - arxiv.org

We study the problem of Distributionally Robust Constrained RL (DRC-RL), where the goal
is to maximize the expected reward subject to environmental distribution shifts and …

Cited by 1 · Related articles · All 4 versions

Sample Efficient Reinforcement Learning from Human Feedback via Active Exploration

V Mehta, V Das, O Neopane, Y Dai, I Bogunovic… - 2023 - openreview.net

Preference-based feedback is important for many applications in reinforcement learning
where direct evaluation of a reward function is not feasible. A notable recent example arises …

Cited by 4 · Related articles · All 3 versions

Tractable Equilibrium Computation in Markov Games through Risk Aversion

E Mazumdar, K Panaganti, L Shi - arXiv preprint arXiv:2406.14156, 2024 - arxiv.org

A significant roadblock to the development of principled multi-agent reinforcement learning
is the fact that desired solution concepts like Nash equilibria may be intractable to compute …

Sensor-Based Distributionally Robust Control for Safe Robot Navigation in Dynamic Environments

K Long, Y Yi, Z Dai, S Herbert, J Cortés… - arXiv preprint arXiv …, 2024 - arxiv.org

We introduce a novel method for safe mobile robot navigation in dynamic, unknown
environments, utilizing onboard sensing to impose safety constraints without the need for …

Cited by 2 · Related articles · All 2 versions

Wasserstein Distributionally Robust Control and State Estimation for Partially Observable Linear Systems

M Jang, A Hakobyan, I Yang - arXiv preprint arXiv:2406.01723, 2024 - arxiv.org

This paper presents a novel Wasserstein distributionally robust control and state estimation
algorithm for partially observable linear stochastic systems, where the probability …

Sample-Efficient Reinforcement Learning with Applications in Nuclear Fusion

V Mehta - kilthub.cmu.edu

In many practical applications of reinforcement learning (RL), it is expensive to observe state
transitions from the environment. In the problem of plasma control for nuclear fusion, the …
