Interpretability is in the mind of the beholder: A causal framework for human-interpretable representation learning

E Marconato, A Passerini, S Teso - Entropy, 2023 - mdpi.com
Research on Explainable Artificial Intelligence has recently started exploring the idea of
producing explanations that, rather than being expressed in terms of low-level features, are …

Improving intervention efficacy via concept realignment in concept bottleneck models

N Singhi, JM Kim, K Roth, Z Akata - European Conference on Computer …, 2025 - Springer
Concept Bottleneck Models (CBMs) ground image classification on human-
understandable concepts to allow for interpretable model decisions as well as human …
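Several of the entries in this listing rely on the same concept-bottleneck architecture, so a generic illustration may help. The sketch below is not taken from any of the listed papers; it is a minimal PyTorch model in which the label is predicted only from human-understandable concepts, with a test-time hook through which a user can overwrite concept predictions (an "intervention"). The feature, concept, and class dimensions and the intervention interface are illustrative assumptions.

import torch
import torch.nn as nn

class ConceptBottleneckModel(nn.Module):
    """Minimal generic sketch: input features -> concepts -> label."""

    def __init__(self, in_features=2048, n_concepts=112, n_classes=200):
        super().__init__()
        # x -> c: predict human-understandable concepts from input features
        self.concept_predictor = nn.Linear(in_features, n_concepts)
        # c -> y: the final label is computed only from the concepts
        self.label_predictor = nn.Linear(n_concepts, n_classes)

    def forward(self, x, intervention=None):
        # Concepts are predicted as independent probabilities.
        concepts = torch.sigmoid(self.concept_predictor(x))
        if intervention is not None:
            # A human may overwrite selected concept predictions at test time;
            # `mask` marks intervened concepts, `values` holds their corrected values.
            mask, values = intervention
            concepts = torch.where(mask, values, concepts)
        # The label depends on the input only through the concept bottleneck.
        return concepts, self.label_predictor(concepts)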

Do Concept Bottleneck Models Obey Locality?

N Raman, ME Zarlenga, J Heo… - XAI in Action: Past …, 2023 - openreview.net
Concept-based learning improves a deep learning model's interpretability by explaining its
predictions via human-understandable concepts. Deep learning models trained under this …

Stochastic concept bottleneck models

M Vandenhirtz, S Laguna, R Marcinkevičs… - arXiv preprint arXiv …, 2024 - arxiv.org
Concept Bottleneck Models (CBMs) have emerged as a promising interpretable method
whose final prediction is based on intermediate, human-understandable concepts rather …

A survey on Concept-based Approaches For Model Improvement

A Gupta, PJ Narayanan - arXiv preprint arXiv:2403.14566, 2024 - arxiv.org
The focus of recent research has shifted from merely increasing the performance of Deep
Neural Networks (DNNs) on various tasks to making DNNs more interpretable to humans. The …

Transparent anomaly detection via concept-based explanations

LR Sevyeri, I Sheth, F Farahnak, SE Kahou… - arXiv preprint arXiv …, 2023 - arxiv.org
Advancements in deep learning techniques have given a boost to the performance of
anomaly detection. However, real-world and safety-critical applications demand a level of …

Concept graph embedding models for enhanced accuracy and interpretability

S Kim, BC Ko - Machine Learning: Science and Technology, 2024 - iopscience.iop.org
In fields requiring high accountability, it is necessary to understand how deep-learning
models make decisions when analyzing the causes of image classification. Concept-based …

Concept-Based Interpretable Reinforcement Learning with Limited to No Human Labels

Z Ye, S Milani, GJ Gordon, F Fang - arXiv preprint arXiv:2407.15786, 2024 - arxiv.org
Recent advances in reinforcement learning (RL) have predominantly leveraged neural
network-based policies for decision-making, yet these models often lack interpretability …

Beyond Thumbs Up/Down: Untangling Challenges of Fine-Grained Feedback for Text-to-Image Generation

KM Collins, N Kim, Y Bitton, V Rieser… - Proceedings of the …, 2024 - ojs.aaai.org
Human feedback plays a critical role in learning and refining reward models for text-to-
image generation, but the optimal form the feedback should take for learning an accurate …

Efficient Bias Mitigation Without Privileged Information

ME Zarlenga, S Sankaranarayanan… - arXiv preprint arXiv …, 2024 - arxiv.org
Deep neural networks trained via empirical risk minimisation often exhibit significant
performance disparities across groups, particularly when group and task labels are …