Efficient policy iteration for robust Markov decision processes via regularization

N Kumar, K Levy, K Wang, S Mannor - arXiv preprint arXiv:2205.14327, 2022 - arxiv.org
Robust Markov decision processes (MDPs) provide a general framework to model decision
problems where the system dynamics are changing or only partially known. Efficient …

Safe model‐based reinforcement learning for nonlinear optimal control with state and input constraints

Y Kim, JW Kim - AIChE Journal, 2022 - Wiley Online Library
Safety is a critical factor in reinforcement learning (RL) in chemical processes. In our
previous work, we proposed a new stability‐guaranteed RL for unconstrained nonlinear …

TempLe: Learning template of transitions for sample efficient multi-task RL

Y Sun, X Yin, F Huang - Proceedings of the AAAI Conference on …, 2021 - ojs.aaai.org
Transferring knowledge among various environments is important for efficiently learning
multiple tasks online. Most existing methods directly use the previously learned models or …

Exploration in reward machines with low regret

H Bourel, A Jonsson, OA Maillard… - International …, 2023 - proceedings.mlr.press
We study reinforcement learning (RL) for decision processes with non-Markovian reward, in
which high-level knowledge in the form of reward machines is available to the learner …

Scaling up Q-learning via exploiting state–action equivalence

Y Lyu, A Côme, Y Zhang, MS Talebi - Entropy, 2023 - mdpi.com
Recent success stories in reinforcement learning have demonstrated that leveraging
structural properties of the underlying environment is key in devising viable methods …

A simple approach for state-action abstraction using a learned MDP homomorphism

AN Mavor-Parker, MJ Sargent, A Banino… - arXiv preprint arXiv …, 2022 - arxiv.org
Animals are able to rapidly infer from limited experience when sets of state action pairs have
equivalent reward and transition dynamics. On the other hand, modern reinforcement …

An Efficient Solution to s-Rectangular Robust Markov Decision Processes

N Kumar, K Levy, K Wang, S Mannor - arXiv preprint arXiv:2301.13642, 2023 - arxiv.org
We present an efficient robust value iteration for s-rectangular robust Markov Decision
Processes (MDPs) with a time complexity comparable to standard (non-robust) MDPs which …

How to Shrink Confidence Sets for Many Equivalent Discrete Distributions?

OA Maillard, MS Talebi - arXiv preprint arXiv:2407.15662, 2024 - arxiv.org
We consider the situation when a learner faces a set of unknown discrete distributions
$(p_k)_{k\in\mathcal{K}}$ defined over a common alphabet $\mathcal{X}$, and can build for …

Efficient Value Iteration for s-rectangular Robust Markov Decision Processes

N Kumar, K Wang, KY Levy, S Mannor - Forty-first International Conference … - openreview.net
We focus on s-rectangular robust Markov decision processes (MDPs), which capture
interconnected uncertainties across different actions within each state. This framework is …

Towards Robust and Adaptable Real-World Reinforcement Learning

Y Sun - 2023 - search.proquest.com
The past decade has witnessed a rapid development of reinforcement learning (RL)
techniques. However, there is still a gap between employing RL in simulators and applying …