Explain the explainer: Interpreting model-agnostic counterfactual explanations of a deep reinforcement learning agent

Authors

Ziheng Chen, Fabrizio Silvestri, Gabriele Tolomei, Jia Wang, He Zhu, Hongshik Ahn

Publication date

2022/11/23

Journal

IEEE Transactions on Artificial Intelligence

Volume

Issue

Pages

1443-1457

Publisher

IEEE

Description

Counterfactual examples (CFs) are one of the most popular methods for attaching post hoc explanations to machine learning models. However, existing CF generation methods either exploit the internals of specific models or depend on each sample's neighborhood; thus, they are hard to generalize for complex models and inefficient for large datasets. This article aims to overcome these limitations and introduces ReLAX, a model-agnostic algorithm to generate optimal counterfactual explanations. Specifically, we formulate the problem of crafting CFs as a sequential decision-making task. We then find the optimal CFs via deep reinforcement learning (DRL) with discrete-continuous hybrid action space. In addition, we develop a distillation algorithm to extract decision rules from the DRL agent's policy in the form of a decision tree to make the process of generating CFs itself interpretable. Extensive experiments …

Total citations

Cited by 18

2022: 1, 2023: 11, 2024: 6
