Explainability in deep reinforcement learning

A Heuillet, F Couthouis, N Díaz-Rodríguez - Knowledge-Based Systems, 2021 - Elsevier
A large set of the explainable Artificial Intelligence (XAI) literature is emerging on feature
relevance techniques to explain a deep neural network (DNN) output or explaining models …

A survey on explainable reinforcement learning: Concepts, algorithms, challenges

Y Qing, S Liu, J Song, H Wang, M Song - arXiv preprint arXiv:2211.06665, 2022 - arxiv.org
Reinforcement Learning (RL) is a popular machine learning paradigm where intelligent
agents interact with the environment to fulfill a long-term goal. Driven by the resurgence of …

How to reuse and compose knowledge for a lifetime of tasks: A survey on continual learning and functional composition

JA Mendez, E Eaton - arXiv preprint arXiv:2207.07730, 2022 - arxiv.org
A major goal of artificial intelligence (AI) is to create an agent capable of acquiring a general
understanding of the world. Such an agent would require the ability to continually …

On the expressivity of Markov reward

D Abel, W Dabney, A Harutyunyan… - Advances in …, 2021 - proceedings.neurips.cc
Reward is the driving force for reinforcement-learning agents. This paper is dedicated to
understanding the expressivity of reward as a way to capture tasks that we would want an …

A survey on interpretable reinforcement learning

C Glanois, P Weng, M Zimmer, D Li, T Yang, J Hao… - Machine Learning, 2024 - Springer
Although deep reinforcement learning has become a promising machine learning approach
for sequential decision-making problems, it is still not mature enough for high-stake domains …

Autotelic agents with intrinsically motivated goal-conditioned reinforcement learning: a short survey

C Colas, T Karch, O Sigaud, PY Oudeyer - Journal of Artificial Intelligence …, 2022 - jair.org
Building autonomous machines that can explore open-ended environments, discover
possible interactions and build repertoires of skills is a general objective of artificial …

Optimistic linear support and successor features as a basis for optimal policy transfer

LN Alegre, A Bazzan… - … conference on machine …, 2022 - proceedings.mlr.press
In many real-world applications, reinforcement learning (RL) agents might have to solve
multiple tasks, each one typically modeled via a reward function. If reward functions are …

Constraint-conditioned policy optimization for versatile safe reinforcement learning

Y Yao, Z Liu, Z Cen, J Zhu, W Yu… - Advances in Neural …, 2024 - proceedings.neurips.cc
Safe reinforcement learning (RL) focuses on training reward-maximizing agents subject to
pre-defined safety constraints. Yet, learning versatile safe policies that can adapt to varying …

MoCoDA: Model-based counterfactual data augmentation

S Pitis, E Creager, A Mandlekar… - Advances in Neural …, 2022 - proceedings.neurips.cc
The number of states in a dynamic process is exponential in the number of objects, making
reinforcement learning (RL) difficult in complex, multi-object domains. For agents to scale to …

Diversifying AI: Towards creative chess with AlphaZero

T Zahavy, V Veeriah, S Hou, K Waugh, M Lai… - arXiv preprint arXiv …, 2023 - arxiv.org
In recent years, Artificial Intelligence (AI) systems have surpassed human intelligence in a
variety of computational tasks. However, AI systems, like humans, make mistakes, have …