multiple agents aim to cooperatively maximize the globally averaged return through
communication with only local neighbors. An asynchronous multi-agent actor-critic algorithm
is proposed for possibly unidirectional communication relationships described by a directed
graph. Each agent independently updates its variables at “event times” determined by its
own clock. It is not assumed that the agents' clocks are synchronized or that the event times …