This survey is focused on certain sequential decision-making problems that involve optimizing over probability functions. We discuss the relevance of these problems for …
When faced with sequential decision-making problems, it is often useful to be able to predict what would happen if decisions were made using a new policy. Those predictions must …
R Wu, M Uehara, W Sun - International Conference on …, 2023 - proceedings.mlr.press
We study the problem of estimating the distribution of the return of a policy using an offline dataset that is not generated from the policy, i.e., distributional offline policy evaluation (OPE) …
Most off-policy evaluation methods for contextual bandits have focused on the expected outcome of a policy, which is estimated via methods that at best provide only asymptotic …
Standard uniform convergence results bound the generalization gap of the expected loss over a hypothesis class. The emergence of risk-sensitive learning requires generalization …
Recommender systems are more and more often modelled as repeated decision-making processes, deciding which (ranking of) items to recommend to a given user. Each decision …
Motivated by the fragility of neural network (NN) controllers in safety-critical applications, we present a data-driven framework for verifying the risk of stochastic dynamical systems with …
Tipping points are abrupt, drastic, and often irreversible changes in the evolution of non-stationary and chaotic dynamical systems. For instance, increased greenhouse gas …
H Liang, Z Luo - International Conference on Artificial …, 2024 - proceedings.mlr.press
We study finite episodic Markov decision processes incorporating dynamic risk measures to capture risk sensitivity. To this end, we present two model-based algorithms applied to …