Markov decision processes with risk-sensitive criteria: an overview

N Bäuerle, A Jaśkiewicz - Mathematical Methods of Operations Research, 2024 - Springer
The paper provides an overview of the theory and applications of risk-sensitive Markov
decision processes. The term 'risk-sensitive' refers here to the use of the Optimized Certainty …

Robust risk-aware reinforcement learning

S Jaimungal, SM Pesenti, YS Wang, H Tatsat - SIAM Journal on Financial …, 2022 - SIAM
We present a reinforcement learning (RL) approach for robust optimization of risk-aware
performance criteria. To allow agents to express a wide variety of risk-reward profiles, we …

Conditionally elicitable dynamic risk measures for deep reinforcement learning

A Coache, S Jaimungal, Á Cartea - SIAM Journal on Financial Mathematics, 2023 - SIAM
We propose a novel framework to solve risk-sensitive reinforcement learning problems
where the agent optimizes time-consistent dynamic spectral risk measures. Based on the …

Reinforcement learning with dynamic convex risk measures

A Coache, S Jaimungal - Mathematical Finance, 2024 - Wiley Online Library
We develop an approach for solving time‐consistent risk‐sensitive stochastic optimization
problems using model‐free reinforcement learning (RL). Specifically, we assume agents …

Risk-sensitive Markov decision process and learning under general utility functions

Z Wu, R Xu - arXiv preprint arXiv:2311.13589, 2023 - arxiv.org
Reinforcement Learning (RL) has gained substantial attention across diverse application
domains and theoretical investigations. Existing literature on RL theory largely focuses on …

On the global convergence of risk-averse policy gradient methods with expected conditional risk measures

X Yu, L Ying - International Conference on Machine …, 2023 - proceedings.mlr.press
Risk-sensitive reinforcement learning (RL) has become a popular tool to control the risk of
uncertain outcomes and ensure reliable performance in various sequential decision-making …

Sequential Decision-Making under Uncertainty: A Robust MDPs review

W Ou, S Bi - arXiv preprint arXiv:2404.00940, 2024 - arxiv.org
This review paper provides an in-depth overview of the evolution and advancements in
Robust Markov Decision Processes (RMDPs), a field of paramount importance for its role in …

A Simple Mixture Policy Parameterization for Improving Sample Efficiency of CVaR Optimization

Y Luo, Y Pan, H Wang, P Torr, P Poupart - arXiv preprint arXiv:2403.11062, 2024 - arxiv.org
Reinforcement learning algorithms utilizing policy gradients (PG) to optimize Conditional
Value at Risk (CVaR) face significant challenges with sample inefficiency, hindering their …

Markov Chain Variance Estimation: A Stochastic Approximation Approach

S Agrawal, ST Maguluri - arXiv preprint arXiv:2409.05733, 2024 - arxiv.org
We consider the problem of estimating the asymptotic variance of a function defined on a
Markov chain, an important step for statistical inference of the stationary mean. We design a …

Risk-Averse Finetuning of Large Language Models

S Chaudhary, U Dinesha, D Kalathil… - arXiv preprint arXiv …, 2025 - arxiv.org
We consider the challenge of mitigating the generation of negative or toxic content by the
Large Language Models (LLMs) in response to certain prompts. We propose integrating risk …