Promoting or hindering: Stealthy black-box attacks against DRL-based traffic signal control

Y Ren, H Zhang, X Cao, C Yang… - IEEE Internet of Things …, 2023 - ieeexplore.ieee.org
Numerous studies have demonstrated, in-depth, the vulnerability of the deep reinforcement
learning (DRL) model's elements (e.g., reward), which is a factor limiting the widespread …

On the near-optimality of local policies in large cooperative multi-agent reinforcement learning

WU Mondal, V Aggarwal, SV Ukkusuri - arXiv preprint arXiv:2209.03491, 2022 - arxiv.org
We show that in a cooperative $N$-agent network, one can design locally executable
policies for the agents such that the resulting discounted sum of average rewards (value) …

Randomized Linear Programming for Tabular Average-Cost Multi-agent Reinforcement Learning

A Koppel, AS Bedi, B Ganguly… - 2021 55th Asilomar …, 2021 - ieeexplore.ieee.org
We focus on multi-agent reinforcement learning in tabular average-cost settings: a team of
agents sequentially interacts with the environment and observes localized incentives. The …

On Optimization Formulations of Finite Horizon MDPs

RV Dwaraknath, L Ying - OPT 2023: Optimization for Machine Learning - openreview.net
In this paper, we extend the connection between linear programming formulations of MDPs
and policy gradient methods for infinite horizon MDPs presented in (Ying, L., & Zhu, Y …