Interpretable deep learning: Interpretation, interpretability, trustworthiness, and beyond

X Li, H Xiong, X Li, X Wu, X Zhang, J Liu, J Bian… - … and Information Systems, 2022 - Springer
Deep neural networks have been well-known for their superb handling of various machine
learning and artificial intelligence tasks. However, due to their over-parameterized black-box …

Interpretable and explainable machine learning: a methods‐centric overview with concrete examples

R Marcinkevičs, JE Vogt - Wiley Interdisciplinary Reviews: Data …, 2023 - Wiley Online Library
Interpretability and explainability are crucial for machine learning (ML) and statistical
applications in medicine, economics, law, and natural sciences and form an essential …

Transformer interpretability beyond attention visualization

H Chefer, S Gur, L Wolf - … of the IEEE/CVF conference on …, 2021 - openaccess.thecvf.com
Self-attention techniques, and specifically Transformers, are dominating the field of text
processing and are becoming increasingly popular in computer vision classification tasks. In …

Backdoorbench: A comprehensive benchmark of backdoor learning

B Wu, H Chen, M Zhang, Z Zhu, S Wei… - Advances in …, 2022 - proceedings.neurips.cc
Backdoor learning is an emerging and vital topic for studying the vulnerability of deep
neural networks (DNNs). Many pioneering backdoor attack and defense methods are being …

Generic attention-model explainability for interpreting bi-modal and encoder-decoder transformers

H Chefer, S Gur, L Wolf - Proceedings of the IEEE/CVF …, 2021 - openaccess.thecvf.com
Transformers are increasingly dominating multi-modal reasoning tasks, such as visual
question answering, achieving state-of-the-art results thanks to their ability to contextualize …

Diffusion visual counterfactual explanations

M Augustin, V Boreiko, F Croce… - Advances in Neural …, 2022 - proceedings.neurips.cc
Visual Counterfactual Explanations (VCEs) are an important tool to understand the
decisions of an image classifier. They are “small” but “realistic” semantic changes of the …

Which explanation should I choose? A function approximation perspective to characterizing post hoc explanations

T Han, S Srinivas, H Lakkaraju - Advances in neural …, 2022 - proceedings.neurips.cc
A critical problem in the field of post hoc explainability is the lack of a common foundational
goal among methods. For example, some methods are motivated by function approximation …

XAI for transformers: Better explanations through conservative propagation

A Ali, T Schnake, O Eberle… - International …, 2022 - proceedings.mlr.press
Transformers have become an important workhorse of machine learning, with numerous
applications. This necessitates the development of reliable methods for increasing their …

ISTVT: interpretable spatial-temporal video transformer for deepfake detection

C Zhao, C Wang, G Hu, H Chen, C Liu… - IEEE Transactions on …, 2023 - ieeexplore.ieee.org
With the rapid development of Deepfake synthesis technology, our information security and
personal privacy have been severely threatened in recent years. To achieve a robust …

Impossibility theorems for feature attribution

B Bilodeau, N Jaques, PW Koh… - Proceedings of the …, 2024 - National Acad Sciences
Despite a sea of interpretability methods that can produce plausible explanations, the field
has also empirically seen many failure cases of such methods. In light of these results, it …