Trustworthy AI: From principles to practices

B Li, P Qi, B Liu, S Di, J Liu, J Pei, J Yi… - ACM Computing Surveys, 2023 - dl.acm.org
The rapid development of Artificial Intelligence (AI) technology has enabled the deployment
of various AI-based systems. However, many current AI systems are found to be vulnerable to …

A survey of adversarial defenses and robustness in NLP

S Goyal, S Doddapaneni, MM Khapra… - ACM Computing …, 2023 - dl.acm.org
In the past few years, it has become increasingly evident that deep neural networks are not
resilient enough to withstand adversarial perturbations in input data, leaving them …
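
A minimal sketch of the kind of input perturbation such surveys study: a character-swap edit that keeps the text readable but can flip a classifier's label. The victim model is omitted here, and all names are illustrative assumptions.

```python
import random

# Toy character-swap perturbation of the kind NLP robustness surveys cover:
# small edits that preserve human readability but can change a model's
# prediction. No victim model is attacked here; this only generates
# candidate perturbations.

def char_swap(text: str, n_swaps: int = 1, seed: int = 0) -> str:
    rng = random.Random(seed)
    chars = list(text)
    for _ in range(n_swaps):
        i = rng.randrange(len(chars) - 1)
        chars[i], chars[i + 1] = chars[i + 1], chars[i]  # swap adjacent chars
    return "".join(chars)

print(char_swap("the movie was great"))  # a lightly perturbed variant
```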

A survey of safety and trustworthiness of large language models through the lens of verification and validation

X Huang, W Ruan, W Huang, G Jin, Y Dong… - Artificial Intelligence …, 2024 - Springer
Large language models (LLMs) have set off a new wave of AI enthusiasm owing to their ability to
engage end-users in human-level conversations with detailed and articulate answers across …

Simple and principled uncertainty estimation with deterministic deep learning via distance awareness

J Liu, Z Lin, S Padhy, D Tran… - Advances in neural …, 2020 - proceedings.neurips.cc
Bayesian neural networks (BNNs) and deep ensembles are principled approaches to
estimating the predictive uncertainty of a deep learning model. However, their practicality in …
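
For contrast with the paper's deterministic approach, here is a minimal sketch of the deep-ensemble baseline the abstract names; the architecture, ensemble size, and dimensions are assumptions.

```python
import torch
import torch.nn as nn

# Deep-ensemble uncertainty sketch (the baseline named in the abstract,
# not the paper's distance-aware method). Each member is an independently
# initialized classifier; uncertainty is the entropy of the mean softmax.

def make_member(in_dim=32, n_classes=10):
    return nn.Sequential(nn.Linear(in_dim, 64), nn.ReLU(),
                         nn.Linear(64, n_classes))

members = [make_member() for _ in range(5)]  # 5 members is a common choice

@torch.no_grad()
def predictive_entropy(x):
    probs = torch.stack([m(x).softmax(dim=-1) for m in members]).mean(dim=0)
    return -(probs * probs.clamp_min(1e-12).log()).sum(dim=-1)

print(predictive_entropy(torch.randn(4, 32)))  # higher = more uncertain
```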

Certified adversarial robustness via randomized smoothing

J Cohen, E Rosenfeld, Z Kolter - International Conference on …, 2019 - proceedings.mlr.press
We show how to turn any classifier that classifies well under Gaussian noise into a new
classifier that is certifiably robust to adversarial perturbations under the L2 norm. While this …
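
A Monte Carlo sketch of the smoothed classifier described in the abstract; `base_clf`, the noise level, and the sample count are assumptions, and the paper additionally lower-bounds the top-class probability with a binomial confidence interval before certifying.

```python
import torch
from torch.distributions import Normal

# Randomized-smoothing sketch: classify many Gaussian-noised copies of x,
# take the majority class, and certify an L2 radius of sigma * Phi^{-1}(p_A).
# `base_clf` maps a (1, d) batch to logits and is assumed, not from the paper.

@torch.no_grad()
def smoothed_predict(base_clf, x, sigma=0.25, n=1000, n_classes=10):
    counts = torch.zeros(n_classes)
    for _ in range(n):
        noisy = x + sigma * torch.randn_like(x)     # Gaussian input noise
        counts[base_clf(noisy).argmax(dim=-1)] += 1
    p_a = counts.max() / n                          # raw top-class estimate
    radius = sigma * Normal(0.0, 1.0).icdf(p_a)     # meaningful when p_a > 1/2
    return counts.argmax().item(), radius.item()
```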

Adversarial attacks and defenses in deep learning

K Ren, T Zheng, Z Qin, X Liu - Engineering, 2020 - Elsevier
With the rapid development of artificial intelligence (AI) and deep learning (DL) techniques,
it is critical to ensure the security and robustness of the deployed algorithms. Recently, the …
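
One canonical attack covered by such surveys is the fast gradient sign method (FGSM); a hedged sketch, assuming image inputs in [0, 1] and a standard classification model:

```python
import torch
import torch.nn.functional as F

# FGSM sketch: perturb the input by epsilon in the direction of the sign
# of the loss gradient. epsilon = 8/255 is a common choice for images.

def fgsm(model, x, y, epsilon=8 / 255):
    x = x.clone().detach().requires_grad_(True)
    F.cross_entropy(model(x), y).backward()
    x_adv = x + epsilon * x.grad.sign()
    return x_adv.clamp(0, 1).detach()   # keep pixels in the valid range
```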

Attacks which do not kill training make adversarial learning stronger

J Zhang, X Xu, B Han, G Niu, L Cui… - International …, 2020 - proceedings.mlr.press
Adversarial training based on the minimax formulation is necessary for obtaining the adversarial
robustness of trained models. However, it is conservative, or even pessimistic, so that it …
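
The abstract's proposal can be sketched as an early-stopped PGD inner loop: stop the maximization as soon as the perturbed input is misclassified, rather than always running all steps. The step size, radius, and the batch-level (rather than per-example) stopping test are simplifying assumptions.

```python
import torch
import torch.nn.functional as F

# Early-stopped ("friendly") PGD sketch: halt the inner maximization once
# the attack already succeeds, instead of running all k steps as in
# standard minimax adversarial training.

def friendly_pgd(model, x, y, eps=8 / 255, step=2 / 255, k=10):
    delta = torch.zeros_like(x, requires_grad=True)
    for _ in range(k):
        logits = model(x + delta)
        if (logits.argmax(dim=-1) != y).all():   # whole batch misclassified
            break                                # ... so stop attacking early
        F.cross_entropy(logits, y).backward()
        with torch.no_grad():
            delta += step * delta.grad.sign()
            delta.clamp_(-eps, eps)              # project onto the L-inf ball
        delta.grad.zero_()
    return (x + delta).detach()                  # train the model on these
```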

On mean absolute error for deep neural network based vector-to-vector regression

J Qi, J Du, SM Siniscalchi, X Ma… - IEEE Signal Processing …, 2020 - ieeexplore.ieee.org
In this paper, we exploit the properties of the mean absolute error (MAE) as a loss function for
deep neural network (DNN)-based vector-to-vector regression. The goal of this work is …
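
A minimal training step with MAE as the regression loss; the network shape, dimensions, and optimizer settings are assumptions, not the paper's setup.

```python
import torch
import torch.nn as nn

# Toy vector-to-vector regressor trained with the MAE (L1) loss studied
# in the paper.

net = nn.Sequential(nn.Linear(64, 128), nn.ReLU(), nn.Linear(128, 64))
mae = nn.L1Loss()                       # mean absolute error
opt = torch.optim.Adam(net.parameters(), lr=1e-3)

x, target = torch.randn(16, 64), torch.randn(16, 64)
loss = mae(net(x), target)              # averages |error| over all dims
opt.zero_grad()
loss.backward()
opt.step()
```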

Efficient neural network robustness certification with general activation functions

H Zhang, TW Weng, PY Chen… - Advances in neural …, 2018 - proceedings.neurips.cc
Finding the minimum distortion of adversarial examples, and thus certifying robustness in neural
network classifiers, is known to be a challenging problem. Nevertheless, it has recently …
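
For intuition only, here is interval bound propagation through a single linear+ReLU layer: a much coarser relative of the paper's linear-relaxation bounds, shown to illustrate how input perturbation bounds propagate layer by layer. Shapes and the perturbation radius are assumptions.

```python
import torch

# Interval bound propagation (IBP) through one linear + ReLU layer. The
# paper derives tighter linear relaxations; IBP is the simplest sound
# alternative: the midpoint passes through W, the radius through |W|.

def ibp_linear_relu(W, b, lo, hi):
    mid, rad = (lo + hi) / 2, (hi - lo) / 2
    new_mid = mid @ W.t() + b
    new_rad = rad @ W.abs().t()
    # ReLU is monotone, so applying it to both bounds stays sound.
    return (new_mid - new_rad).relu(), (new_mid + new_rad).relu()

W, b, x = torch.randn(32, 16), torch.randn(32), torch.randn(16)
lo, hi = ibp_linear_relu(W, b, x - 0.01, x + 0.01)  # L-inf ball, eps = 0.01
```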

Efficient and accurate estimation of Lipschitz constants for deep neural networks

M Fazlyab, A Robey, H Hassani… - Advances in neural …, 2019 - proceedings.neurips.cc
Tight estimation of the Lipschitz constant for deep neural networks (DNNs) is useful in many
applications ranging from robustness certification of classifiers to stability analysis of closed …
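
As a crude point of contrast with the paper's semidefinite-programming estimate, the product of per-layer spectral norms gives a simple (but typically very loose) Lipschitz upper bound; the network here is an assumption.

```python
import torch
import torch.nn as nn

# Naive Lipschitz upper bound for a feedforward ReLU network: the product
# of per-layer spectral norms. ReLU is 1-Lipschitz, so only the linear
# layers contribute. The paper's SDP-based estimate is much tighter.

net = nn.Sequential(nn.Linear(16, 64), nn.ReLU(), nn.Linear(64, 10))

bound = 1.0
for layer in net:
    if isinstance(layer, nn.Linear):
        # largest singular value = the layer's Lipschitz constant in L2
        bound *= torch.linalg.matrix_norm(layer.weight, ord=2).item()
print(f"Lipschitz constant <= {bound:.3f}")
```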