Toward transparent AI: A survey on interpreting the inner structures of deep neural networks

T Räuker, A Ho, S Casper… - 2023 IEEE conference …, 2023 - ieeexplore.ieee.org
The last decade of machine learning has seen drastic increases in scale and capabilities.
Deep neural networks (DNNs) are increasingly being deployed in the real world. However …

Robust recommender system: a survey and future directions

K Zhang, Q Cao, F Sun, Y Wu, S Tao, H Shen… - arXiv preprint arXiv …, 2023 - arxiv.org
With the rapid growth of information, recommender systems have become integral for
providing personalized suggestions and overcoming information overload. However, their …

Explainable automated fact-checking: A survey

N Kotonya, F Toni - arXiv preprint arXiv:2011.03870, 2020 - arxiv.org
A number of exciting advances have been made in automated fact-checking thanks to
increasingly larger datasets and more powerful systems, leading to improvements in the …

On relating explanations and adversarial examples

A Ignatiev, N Narodytska… - Advances in neural …, 2019 - proceedings.neurips.cc
The importance of explanations (XP's) of machine learning (ML) model predictions and of
adversarial examples (AE's) cannot be overstated, with both arguably being essential for the …

Who needs explanation and when? Juggling explainable AI and user epistemic uncertainty

J Jiang, S Kahai, M Yang - International Journal of Human-Computer …, 2022 - Elsevier
In recent years, AI explainability (XAI) has received wide attention. Although XAI is expected
to play a positive role in decision-making and advice acceptance, various opposing effects …

The intriguing relation between counterfactual explanations and adversarial examples

T Freiesleben - Minds and Machines, 2022 - Springer
The same method that creates adversarial examples (AEs) to fool image-classifiers can be
used to generate counterfactual explanations (CEs) that explain algorithmic decisions. This …

Robust feature-level adversaries are interpretability tools

S Casper, M Nadeau… - Advances in Neural …, 2022 - proceedings.neurips.cc
The literature on adversarial attacks in computer vision typically focuses on pixel-level
perturbations. These tend to be very difficult to interpret. Recent work that manipulates the …

Pay attention to your loss: understanding misconceptions about Lipschitz neural networks

L Béthune, T Boissin, M Serrurier… - Advances in …, 2022 - proceedings.neurips.cc
Lipschitz constrained networks have gathered considerable attention in the deep learning
community, with usages ranging from Wasserstein distance estimation to the training of …

[BOOK][B] Adversarial Machine Learning: Attack Surfaces, Defence Mechanisms, Learning Theories in Artificial Intelligence

AS Chivukula, X Yang, B Liu, W Liu, W Zhou - 2023 - Springer
A significant robustness gap exists between machine intelligence and human perception
despite recent advances in deep learning. Deep learning is not provably secure. A critical …

Diagnostics for deep neural networks with automated copy/paste attacks

S Casper, K Hariharan, D Hadfield-Menell - arXiv preprint arXiv …, 2022 - arxiv.org
This paper considers the problem of helping humans exercise scalable oversight over deep
neural networks (DNNs). Adversarial examples can be useful by helping to reveal …