[BOOK][B] Markov decision processes with applications to finance

N Bäuerle, U Rieder - 2011 - books.google.com
The theory of Markov decision processes focuses on controlled Markov chains in discrete
time. The authors establish the theory for general state and action spaces and at the same …

Safe policies for reinforcement learning via primal-dual methods

S Paternain, M Calvo-Fullana… - … on Automatic Control, 2022 - ieeexplore.ieee.org
In this article, we study the design of controllers in the context of stochastic optimal control
under the assumption that the model of the system is not available. That is, we aim to control …

Sensitivity of multiperiod optimization problems with respect to the adapted Wasserstein distance

D Bartl, J Wiesel - SIAM Journal on Financial Mathematics, 2023 - SIAM
We analyze the effect of small changes in the underlying probabilistic model on the value of
multiperiod stochastic optimization problems and optimal stopping problems. We work in …

Utility maximization under model uncertainty in discrete time

M Nutz - Mathematical Finance, 2016 - Wiley Online Library
We give a general formulation of the utility maximization problem under nondominated
model uncertainty in discrete time and show that an optimal portfolio exists for any utility …

Learning safe policies via primal-dual methods

S Paternain, M Calvo-Fullana… - 2019 IEEE 58th …, 2019 - ieeexplore.ieee.org
In this paper, we study the learning of safe policies in the setting of reinforcement learning
problems. That is, we aim to control a Markov Decision Process (MDP) of which we do not …

Stochastic policy gradient ascent in reproducing kernel Hilbert spaces

S Paternain, JA Bazerque, A Small… - IEEE Transactions on …, 2020 - ieeexplore.ieee.org
Reinforcement learning consists of finding policies that maximize an expected cumulative
long-term reward in a Markov decision process with unknown transition probabilities and …

Discrete time optimal investment under model uncertainty

L Carassus, M Ferhoune - arXiv preprint arXiv:2307.11919, 2023 - arxiv.org
We study a robust utility maximization problem in a general discrete-time frictionless market
under quasi-sure no-arbitrage. The investor is assumed to have a random and concave …

Sensitivity of multiperiod optimization problems in adapted Wasserstein distance

D Bartl, J Wiesel - arXiv preprint arXiv:2208.05656, 2022 - arxiv.org
We analyze the effect of small changes in the underlying probabilistic model on the value of
multi-period stochastic optimization problems and optimal stopping problems. We work in …

Mean‐ρ portfolio selection and ρ‐arbitrage for coherent risk measures

M Herdegen, N Khan - Mathematical Finance, 2022 - Wiley Online Library
We revisit mean‐risk portfolio selection in a one‐period financial market where risk is
quantified by a positively homogeneous risk measure. We first show that under mild …

How non-arbitrage, viability and numéraire portfolio are related

T Choulli, J Deng, J Ma - Finance and Stochastics, 2015 - Springer
This paper proposes two approaches that quantify the exact relationship among viability,
absence of arbitrage, and/or existence of the numéraire portfolio under minimal assumptions …