Towards trustworthy and aligned machine learning: A data-centric survey with causality perspectives

H Liu, M Chaudhary, H Wang - arXiv preprint arXiv:2307.16851, 2023 - arxiv.org
The trustworthiness of machine learning has emerged as a critical topic in the field,
encompassing various applications and research areas such as robustness, security …

Technologies for trustworthy machine learning: A survey in a socio-technical context

E Toreini, M Aitken, KPL Coopamootoo, K Elliott… - arXiv preprint arXiv …, 2020 - arxiv.org
Concerns about the societal impact of AI-based services and systems have encouraged
governments and other organisations around the world to propose AI policy frameworks to …

Trustworthy machine learning and artificial intelligence

KR Varshney - XRDS: Crossroads, The ACM Magazine for Students, 2019 - dl.acm.org
XRDS, Spring 2019, Vol. 25, No. 3. … Math Destruction by Cathy O’Neil, catalogs
numerous examples of …

Overfitting in adversarially robust deep learning

L Rice, E Wong, Z Kolter - International conference on …, 2020 - proceedings.mlr.press
It is common practice in deep learning to use overparameterized networks and train for as
long as possible; there are numerous studies that show, both theoretically and empirically …

A framework quantifying trustworthiness of supervised machine and deep learning models

A Huertas Celdran, J Kreischer, M Demirci… - CEUR Workshop …, 2023 - zora.uzh.ch
Trusting Artificial Intelligence (AI) is controversial since models and predictions might not be
fair, understandable by humans, robust against adversaries, or trained appropriately …

Towards better understanding of training certifiably robust models against adversarial examples

S Lee, W Lee, J Park, J Lee - Advances in Neural …, 2021 - proceedings.neurips.cc
We study the problem of training certifiably robust models against adversarial examples.
Certifiable training minimizes an upper bound on the worst-case loss over the allowed …

Adversarial Robustness Toolbox v1.0.0

MI Nicolae, M Sinn, MN Tran, B Buesser… - arXiv preprint arXiv …, 2018 - arxiv.org
Adversarial Robustness Toolbox (ART) is a Python library supporting developers and
researchers in defending Machine Learning models (Deep Neural Networks, Gradient …

Making machine learning trustworthy

B Eshete - Science, 2021 - science.org
Machine learning (ML) has advanced dramatically during the past decade and continues to
achieve impressive human-level performance on nontrivial tasks in image, speech, and text …

Diversify your datasets: Analyzing generalization via controlled variance in adversarial datasets

O Rozen, V Shwartz, R Aharoni, I Dagan - arXiv preprint arXiv:1910.09302, 2019 - arxiv.org
Phenomenon-specific "adversarial" datasets have been recently designed to perform
targeted stress-tests for particular inference types. Recent work (Liu et al., 2019a) proposed …

Towards an adversarially robust normalization approach

M Awais, F Shamshad, SH Bae - arXiv preprint arXiv:2006.11007, 2020 - arxiv.org
Batch Normalization (BatchNorm) is effective for improving the performance and
accelerating the training of deep neural networks. However, it has also been shown to be a cause …