Exploring counterfactual explanations through the lens of adversarial examples: A theoretical and empirical analysis

M Pawelczyk, C Agarwal, S Joshi… - International …, 2022 - proceedings.mlr.press
As machine learning (ML) models become more widely deployed in high-stakes
applications, counterfactual explanations have emerged as key tools for providing …

Distal explanations for model-free explainable reinforcement learning

P Madumal, T Miller, L Sonenberg, F Vetere - arXiv preprint arXiv …, 2020 - arxiv.org
In this paper we introduce and evaluate a distal explanation model for model-free
reinforcement learning agents that can generate explanations for 'why' and 'why not' questions …

Counterfactual explanations using optimization with constraint learning

D Maragno, TE Röber, I Birbil - arXiv preprint arXiv:2209.10997, 2022 - arxiv.org
To increase the adoption of counterfactual explanations in practice, several criteria that
these should adhere to have been put forward in the literature. We propose counterfactual …

Explaining reinforcement learning agents through counterfactual action outcomes

Y Amitai, Y Septon, O Amir - Proceedings of the AAAI Conference on …, 2024 - ojs.aaai.org
Explainable reinforcement learning (XRL) methods aim to help elucidate agent policies and
decision-making processes. The majority of XRL approaches focus on local explanations …

Generation of policy-level explanations for reinforcement learning

N Topin, M Veloso - Proceedings of the AAAI Conference on Artificial …, 2019 - ojs.aaai.org
Though reinforcement learning has greatly benefited from the incorporation of neural
networks, the inability to verify the correctness of such systems limits their use. Current work …

Explainability in deep reinforcement learning

A Heuillet, F Couthouis, N Díaz-Rodríguez - Knowledge-Based Systems, 2021 - Elsevier
A large set of the explainable Artificial Intelligence (XAI) literature is emerging on feature
relevance techniques to explain a deep neural network (DNN) output or explaining models …

GANterfactual-RL: Understanding reinforcement learning agents' strategies through visual counterfactual explanations

T Huber, M Demmler, S Mertes, ML Olson… - arXiv preprint arXiv …, 2023 - arxiv.org
Counterfactual explanations are a common tool to explain artificial intelligence models. For
Reinforcement Learning (RL) agents, they answer "Why not?" or "What if?" questions by …

Designing counterfactual generators using deep model inversion

J Thiagarajan, VS Narayanaswamy… - Advances in …, 2021 - proceedings.neurips.cc
Explanation techniques that synthesize small, interpretable changes to a given image while
producing desired changes in the model prediction have become popular for introspecting …

A novel policy-graph approach with natural language and counterfactual abstractions for explaining reinforcement learning agents

T Liu, J McCalmon, T Le, MA Rahman, D Lee… - Autonomous Agents and …, 2023 - Springer
As reinforcement learning (RL) continues to improve and be applied in situations alongside
humans, the need to explain the learned behaviors of RL agents to end-users becomes …

Counterfactual explanation trees: Transparent and consistent actionable recourse with decision trees

K Kanamori, T Takagi… - … Conference on Artificial …, 2022 - proceedings.mlr.press
Counterfactual Explanation (CE) is a post-hoc explanation method that provides a
perturbation for altering the prediction result of a classifier. An individual can interpret the …