An overview of multi-agent reinforcement learning from game theoretical perspective

Y Yang, J Wang - arXiv preprint arXiv:2011.00583, 2020 - arxiv.org
Following the remarkable success of the AlphaGo series, 2019 was a booming year that
witnessed significant advances in multi-agent reinforcement learning (MARL) techniques …

Causal inference and counterfactual prediction in machine learning for actionable healthcare

M Prosperi, Y Guo, M Sperrin, JS Koopman… - Nature Machine …, 2020 - nature.com
Big data, high-performance computing, and (deep) machine learning are increasingly
becoming key to precision medicine—from identifying disease risks and taking preventive …

Deep counterfactual regret minimization

N Brown, A Lerer, S Gross… - … conference on machine …, 2019 - proceedings.mlr.press
Abstract Counterfactual Regret Minimization (CFR) is the leading algorithm for solving large
imperfect-information games. It converges to an equilibrium by iteratively traversing the …

Memory-based deep reinforcement learning for obstacle avoidance in UAV with limited environment knowledge

A Singla, S Padakandla… - IEEE transactions on …, 2019 - ieeexplore.ieee.org
This paper presents our method for enabling a UAV quadrotor, equipped with a monocular
camera, to autonomously avoid collisions with obstacles in unstructured and unknown …

Actor-critic policy optimization in partially observable multiagent environments

S Srinivasan, M Lanctot, V Zambaldi… - Advances in neural …, 2018 - proceedings.neurips.cc
Optimization of parameterized policies for reinforcement learning (RL) is an important and
challenging problem in artificial intelligence. Among the most common approaches are …

Recent advances in deep reinforcement learning applications for solving partially observable Markov decision processes (POMDP) problems: Part 1—fundamentals …

X Xiang, S Foo - Machine Learning and Knowledge Extraction, 2021 - mdpi.com
The first part of a two-part series of papers provides a survey on recent advances in Deep
Reinforcement Learning (DRL) applications for solving partially observable Markov decision …

Computing optimal equilibria and mechanisms via learning in zero-sum extensive-form games

B Zhang, G Farina, I Anagnostides… - Advances in …, 2024 - proceedings.neurips.cc
We introduce a new approach for computing optimal equilibria via learning in games. It
applies to extensive-form settings with any number of players, including mechanism design …

DREAM: Deep regret minimization with advantage baselines and model-free learning

E Steinberger, A Lerer, N Brown - arXiv preprint arXiv:2006.10410, 2020 - arxiv.org
We introduce DREAM, a deep reinforcement learning algorithm that finds optimal strategies
in imperfect-information games with multiple agents. Formally, DREAM converges to a Nash …

ReMIX: Regret minimization for monotonic value function factorization in multiagent reinforcement learning

Y Mei, H Zhou, T Lan - arXiv preprint arXiv:2302.05593, 2023 - arxiv.org
Value function factorization methods have become a dominant approach for cooperative
multiagent reinforcement learning under a centralized training and decentralized execution …

Double neural counterfactual regret minimization

H Li, K Hu, Z Ge, T Jiang, Y Qi, L Song - arXiv preprint arXiv:1812.10607, 2018 - arxiv.org
Counterfactual Regret Minimization (CFR) is a fundamental and effective technique for
solving Imperfect Information Games (IIG). However, the original CFR algorithm only works …