Simulation optimization in the new era of AI

Y Peng, CH Chen, MC Fu - … the Frontiers of OR/MS: From …, 2023 - pubsonline.informs.org
We review simulation optimization methods and discuss how these methods underpin
modern artificial intelligence (AI) techniques. In particular, we focus on three areas …

Review of Large-Scale Simulation Optimization

W Fan, LJ Hong, G Jiang, J Luo - arXiv preprint arXiv:2403.15669, 2024 - arxiv.org
Large-scale simulation optimization (SO) problems encompass both large-scale
ranking-and-selection problems and high-dimensional discrete or continuous SO problems …

Quantile-based policy optimization for reinforcement learning

J Jiang, Y Peng, J Hu - 2022 Winter Simulation Conference …, 2022 - ieeexplore.ieee.org
Classical reinforcement learning (RL) aims to optimize the expected cumulative rewards. In
this work, we consider the RL setting where the goal is to optimize the quantile of the …

Quantile-based deep reinforcement learning using two-timescale policy gradient algorithms

J Jiang, J Hu, Y Peng - arXiv preprint arXiv:2305.07248, 2023 - arxiv.org
Classical reinforcement learning (RL) aims to optimize the expected cumulative reward. In
this work, we consider the RL setting where the goal is to optimize the quantile of the …

Simulation optimization of conditional value-at-risk

J Hu, M Song, MC Fu, Y Peng - IISE Transactions, 2024 - Taylor & Francis
Conditional value-at-risk (CVaR) is a well-established tool for measuring risk. In this article,
we consider solving CVaR optimization problems within a general simulation context. We …

A policy gradient approach for optimization of smooth risk measures

N Vijayan, LA Prashanth - Uncertainty in Artificial …, 2023 - proceedings.mlr.press
We propose policy gradient algorithms for solving a risk-sensitive reinforcement learning
(RL) problem in on-policy as well as off-policy settings. We consider episodic Markov …

Tilted Quantile Gradient Updates for Quantile-Constrained Reinforcement Learning

C Li, G Ruan, H Geng - arXiv preprint arXiv:2412.13184, 2024 - arxiv.org
Safe reinforcement learning (RL) is a popular and versatile paradigm to learn
reward-maximizing policies with safety guarantees. Previous works tend to express the safety …

Generalized likelihood ratio method for stochastic models with uniform random numbers as inputs

Y Peng, MC Fu, J Hu, P L'Ecuyer, B Tuffin - European Journal of …, 2025 - Elsevier
We propose a new unbiased stochastic gradient estimator for a family of stochastic models
driven by uniform random numbers as inputs. Dropping the requirement that the tails of the …

Distortion Risk Measure-Based Deep Reinforcement Learning

J Jiang, B Heidergott, J Hu… - 2024 Winter Simulation …, 2024 - ieeexplore.ieee.org
Mainstream reinforcement learning (RL) typically focuses on maximizing expected
cumulative rewards. In this paper, we explore a risk-sensitive RL setting where the objective …

Eliminating Ratio Bias for Gradient-based Simulated Parameter Estimation

Z Li, Y Peng - arXiv preprint arXiv:2411.12995, 2024 - arxiv.org
This article addresses the challenge of parameter calibration in stochastic models where the
likelihood function is not analytically available. We propose a gradient-based simulated …