Artificial neural networks have driven great progress in prognostics and health management (PHM), especially in intelligent fault diagnosis (IFD), but their black-box nature leads to a lack of interpretability and weak robustness under complex environmental variations. When the operating environment changes, a model tends to make wrong decisions, which can be costly, especially for major equipment whose users trust the model too readily. Researchers have therefore studied eXplainable Artificial Intelligence (XAI) for IFD to better understand these models. Most such studies express interpretability by drawing gradient-based saliency maps that show where the model focuses; these maps give little consideration to causal effect, are not sparse enough, and lack quantitative metrics. To address these issues, we design an XAI method that uses a neural network as an instance-wise feature selector to select the frequency bands that have stronger causal strength with the diagnosis result than others, and thereby to explain the diagnosis model. We quantify causal strength with the relative entropy distance (RED) and use a simplified RED as the objective function for optimizing the selector model. Finally, our experiments demonstrate the superiority of our method over another algorithm, L2X, as measured by post-hoc accuracy (PHA), a variant of the average causal effect (ACE), and visualization plots.
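
To make the PHA metric concrete, the following is a minimal sketch and not the implementation used in this work. It assumes that PHA is computed as in the L2X setting: the fraction of samples whose predicted label is unchanged when only the selected features are kept. The names `top_k_band_mask` and `post_hoc_accuracy`, the choice of zeroing unselected frequency bands, and the toy linear classifier are all illustrative assumptions.

```python
import numpy as np

def top_k_band_mask(scores, k):
    """Binary mask keeping the k highest-scoring frequency bands per sample.

    `scores` stands in for the output of a selector network (hypothetical here).
    """
    scores = np.asarray(scores, dtype=float)
    top_idx = np.argsort(-scores, axis=1)[:, :k]   # indices of the k largest scores
    mask = np.zeros_like(scores)
    np.put_along_axis(mask, top_idx, 1.0, axis=1)
    return mask

def post_hoc_accuracy(predict, spectra, mask):
    """Fraction of samples whose predicted class survives the masking.

    Assumption: unselected bands are suppressed by setting them to zero.
    """
    full_pred = np.argmax(predict(spectra), axis=1)
    kept_pred = np.argmax(predict(spectra * mask), axis=1)
    return float(np.mean(full_pred == kept_pred))

# Toy stand-in for a diagnosis model: 4 frequency bands, 2 fault classes.
W = np.array([[1.0, 0.0], [0.0, 1.0], [0.0, 0.0], [0.0, 0.0]])
predict = lambda x: x @ W

spectra = np.array([[3.0, 1.0, 5.0, 5.0],
                    [1.0, 4.0, 2.0, 2.0]])

# A selector that picks the band each decision depends on.
good_scores = np.array([[0.9, 0.1, 0.0, 0.0],
                        [0.1, 0.8, 0.05, 0.05]])
# A selector that always picks a band the classifier ignores.
bad_scores = np.array([[0.0, 0.0, 1.0, 0.0],
                       [0.0, 0.0, 1.0, 0.0]])

pha_good = post_hoc_accuracy(predict, spectra, top_k_band_mask(good_scores, k=1))
pha_bad = post_hoc_accuracy(predict, spectra, top_k_band_mask(bad_scores, k=1))
```

In this toy setting the informative selector preserves every prediction, while the uninformative one preserves only the sample that happens to fall on the default class, so a higher PHA indicates that the selected bands carry more of the information the model relies on.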