Robust Markov decision processes (MDPs) aim to handle changing or partially known system dynamics. To solve them, one typically resorts to robust optimization methods …
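The robust optimization approach this snippet alludes to can be made concrete with a worst-case Bellman backup. Below is a minimal sketch of tabular robust value iteration, assuming the uncertainty is given as a finite list of candidate transition kernels; all names and shapes are illustrative, not from the cited paper.

```python
import numpy as np

def robust_value_iteration(rewards, transition_models, gamma=0.95, tol=1e-8):
    """Tabular robust value iteration (illustrative sketch): at each backup,
    an adversary picks the worst kernel from a finite uncertainty set.

    rewards: array of shape (S, A)
    transition_models: list of candidate kernels, each of shape (S, A, S)
    """
    v = np.zeros(rewards.shape[0])
    while True:
        # Worst-case expected next value over the uncertainty set, per (s, a).
        worst_next = np.min([P @ v for P in transition_models], axis=0)  # (S, A)
        v_new = np.max(rewards + gamma * worst_next, axis=1)             # (S,)
        if np.max(np.abs(v_new - v)) < tol:
            return v_new
        v = v_new
```

The inner min/outer max makes the backup a contraction just like the standard Bellman operator, so the same fixed-point convergence argument applies.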
Offline reinforcement learning (RL) presents a promising approach for learning policies from offline datasets without the need for costly or unsafe interactions with the …
Robust Markov decision processes (MDPs) are used for applications of dynamic optimization in uncertain environments and have been studied extensively. Many of the …
H Husain, J Knoblauch - International Conference on …, 2022 - proceedings.mlr.press
We build on the optimization-centric view on Bayesian inference advocated by Knoblauch et al. (2019). Thinking about Bayesian and generalized Bayesian posteriors as the solutions to …
In robust Markov decision processes (RMDPs), it is assumed that the reward and the transition dynamics lie in a given uncertainty set. By targeting maximal return under the most …
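For reference, the worst-case objective this snippet describes is the robust Bellman equation; in its standard $(s,a)$-rectangular form (an assumption here, since the snippet is truncated) it reads

$$
v^*(s) \;=\; \max_{a \in A} \; \min_{(r,\,P)\,\in\,\mathcal{U}(s,a)} \Big[\, r \;+\; \gamma \sum_{s'} P(s')\, v^*(s') \,\Big],
$$

where $\mathcal{U}(s,a) \subseteq \mathbb{R} \times \Delta(S)$ is the uncertainty set of reward values and transition distributions at the pair $(s,a)$.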
We study robust Markov games (RMG) with $s$-rectangular uncertainty. We show a general equivalence between computing a robust Nash equilibrium (RNE) of an $s$-…
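As context for the rectangularity assumption: an $s$-rectangular uncertainty set factorizes across states rather than across state-action pairs, i.e. (standard definition, not specific to this paper)

$$
\mathcal{P} \;=\; \prod_{s \in S} \mathcal{P}_s, \qquad \mathcal{P}_s \subseteq \big(\Delta(S)\big)^{A},
$$

in contrast to the finer $(s,a)$-rectangular case $\mathcal{P} = \prod_{s,a} \mathcal{P}_{s,a}$, where the adversary chooses each action's transition distribution independently.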
Mirror descent value iteration (MDVI), an abstraction of Kullback-Leibler (KL) and entropy-regularized reinforcement learning (RL), has served as the basis for recent high-performing …
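MDVI subsumes both KL- and entropy-regularized value iteration; the sketch below shows only the entropy-regularized ("soft") special case, where the hard max of value iteration becomes a log-sum-exp. It is an illustrative implementation, not the paper's algorithm (full MDVI additionally carries a KL term toward the previous policy, which gives it the mirror-descent flavor).

```python
import numpy as np
from scipy.special import logsumexp

def soft_value_iteration(rewards, P, tau=0.1, gamma=0.99, iters=500):
    """Entropy-regularized value iteration, the 'soft' special case that
    MDVI abstracts (illustrative sketch).

    rewards: (S, A); P: (S, A, S) transition kernel; tau: temperature.
    """
    q = np.zeros_like(rewards)
    for _ in range(iters):
        # Soft (log-sum-exp) backup replaces the hard max over actions.
        v = tau * logsumexp(q / tau, axis=1)   # (S,)
        q = rewards + gamma * P @ v            # (S, A)
    # Greedy regularized policy is a softmax of q at temperature tau.
    policy = np.exp((q - tau * logsumexp(q / tau, axis=1, keepdims=True)) / tau)
    return q, policy
```

As tau goes to 0 the log-sum-exp approaches the hard max and the softmax policy approaches the greedy one, recovering standard value iteration.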
We extend temporal-difference (TD) learning in order to obtain risk-sensitive, model-free reinforcement learning algorithms. This extension can be regarded as a modification of the …
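A common way to realize such a risk-sensitive extension is to pass the TD error through an asymmetric transform before the update, so that negative surprises are weighted more heavily. The sketch below assumes a tabular value function and a risk parameter kappa in (-1, 1); whether this matches the truncated snippet's exact scheme is an assumption.

```python
def risk_sensitive_td_update(v, s, r, s_next, alpha=0.1, gamma=0.99, kappa=0.5):
    """One risk-sensitive TD(0) update (illustrative sketch): the TD error is
    scaled asymmetrically, so with kappa > 0 negative errors (bad surprises)
    are amplified and positive ones are damped. kappa in (-1, 1) is assumed.
    """
    delta = r + gamma * v[s_next] - v[s]  # ordinary TD error
    transformed = (1 - kappa) * delta if delta > 0 else (1 + kappa) * delta
    v[s] += alpha * transformed
    return v
```

Setting kappa = 0 recovers ordinary risk-neutral TD(0), while kappa near 1 drives the learned values toward a pessimistic, worst-case assessment.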
Reinforcement learning (RL) is recognized as lacking generalization and robustness under environmental perturbations, which severely restricts its application to real-world …