Authors
Madhumita Sushil, Simon Šuster, Walter Daelemans
Publication date
2018/8/29
Journal
BlackboxNLP, EMNLP
Description
Understanding the behavior of a trained network and finding explanations for its outputs is important for improving the network's performance and generalization ability, and for ensuring trust in automated systems. Several approaches have previously been proposed to identify and visualize the most important features by analyzing a trained network. However, the relations between different features and classes are lost in most cases. We propose a technique to induce sets of if-then-else rules that capture these relations to globally explain the predictions of a network. We first calculate the importance of the features in the trained network. We then weigh the original inputs with these feature importance scores, simplify the transformed input space, and finally fit a rule induction model to explain the model predictions. We find that the output rule-sets can explain the predictions of a neural network trained for 4-class text classification from the 20 newsgroups dataset to a macro-averaged F-score of 0.80. We make the code available at https://github.com/clips/interpret_with_rules.
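The abstract outlines a three-step pipeline: compute feature importance scores from the trained network, weigh the original inputs by those scores, and fit a rule induction model to the network's predictions over the weighted inputs. The following is a minimal, hypothetical Python sketch of that pipeline, not the authors' implementation (see the linked repository for that). The inputs predict_fn, importance, and feature_names are assumed to be supplied by the caller, and a shallow scikit-learn decision tree is used only as a stand-in for the paper's rule-induction step, since tree paths read as if-then-else rules.

import numpy as np
from sklearn.tree import DecisionTreeClassifier, export_text

def explain_with_rules(predict_fn, X, importance, feature_names, max_depth=4):
    # Scale each input feature by its pre-computed importance score
    # (importance is a length-n_features vector; broadcasting handles the rest).
    X_weighted = X * importance
    # The labels to explain are the network's own predictions, not the gold labels.
    y_pred = predict_fn(X)
    # Shallow decision tree as a stand-in surrogate for rule induction;
    # each root-to-leaf path is readable as an if-then-else rule.
    surrogate = DecisionTreeClassifier(max_depth=max_depth, random_state=0)
    surrogate.fit(X_weighted, y_pred)
    # Return the induced rules in a human-readable text form.
    return export_text(surrogate, feature_names=list(feature_names))

The fidelity of such a surrogate can be checked the same way the paper does, by scoring its output against the network's predictions (e.g., with a macro-averaged F-score).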
Total citations
[Citations-per-year chart, 2019–2024]
Scholar articles
M Sushil, S Šuster, W Daelemans - arXiv preprint arXiv:1808.09744, 2018