Reinforcement learning for selective key applications in power systems: Recent advances and future challenges

X Chen, G Qu, Y Tang, S Low… - IEEE Transactions on …, 2022 - ieeexplore.ieee.org
With large-scale integration of renewable generation and distributed energy resources,
modern power systems are confronted with new operational challenges, such as growing …

Recent advances in reinforcement learning in finance

B Hambly, R Xu, H Yang - Mathematical Finance, 2023 - Wiley Online Library
The rapid changes in the finance industry due to the increasing amount of data have
revolutionized the techniques on data processing and data analysis and brought new …

Online robust reinforcement learning with model uncertainty

Y Wang, S Zou - Advances in Neural Information Processing …, 2021 - proceedings.neurips.cc
Robust reinforcement learning (RL) is to find a policy that optimizes the worst-case
performance over an uncertainty set of MDPs. In this paper, we focus on model-free robust …

CRPO: A new approach for safe reinforcement learning with convergence guarantee

T Xu, Y Liang, G Lan - International Conference on Machine …, 2021 - proceedings.mlr.press
In safe reinforcement learning (SRL) problems, an agent explores the environment to
maximize an expected total reward and meanwhile avoids violation of certain constraints on …

A finite-time analysis of two time-scale actor-critic methods

YF Wu, W Zhang, P Xu, Q Gu - Advances in Neural …, 2020 - proceedings.neurips.cc
Actor-critic (AC) methods have exhibited great empirical success compared with other
reinforcement learning algorithms, where the actor uses the policy gradient to improve the …

Improving sample complexity bounds for (natural) actor-critic algorithms

T Xu, Z Wang, Y Liang - Advances in Neural Information …, 2020 - proceedings.neurips.cc
The actor-critic (AC) algorithm is a popular method to find an optimal policy in reinforcement
learning. In the infinite horizon scenario, the finite-sample convergence rate for the AC and …

On finite-time convergence of actor-critic algorithm

S Qiu, Z Yang, J Ye, Z Wang - IEEE Journal on Selected Areas …, 2021 - ieeexplore.ieee.org
Actor-critic algorithms and their extensions have made great achievements in real-world
decision-making problems. In contrast to its empirical success, the theoretical understanding …

Learning multi-agent behaviors from distributed and streaming demonstrations

S Liu, M Zhu - Advances in Neural Information Processing …, 2024 - proceedings.neurips.cc
This paper considers the problem of inferring the behaviors of multiple interacting experts by
estimating their reward functions and constraints where the distributed demonstrated …

Reinforcement learning for decision-making and control in power systems: Tutorial, review, and vision

X Chen, G Qu, Y Tang, S Low… - arXiv preprint arXiv …, 2021 - authors.library.caltech.edu
With large-scale integration of renewable generation and distributed energy resources
(DERs), modern power systems are confronted with new operational challenges, such as …

A general sample complexity analysis of vanilla policy gradient

R Yuan, RM Gower, A Lazaric - International Conference on …, 2022 - proceedings.mlr.press
We adapt recent tools developed for the analysis of Stochastic Gradient Descent (SGD) in
non-convex optimization to obtain convergence and sample complexity guarantees for the …