Data-centric artificial intelligence: A survey

D Zha, ZP Bhat, KH Lai, F Yang, Z Jiang… - ACM Computing …, 2023 - dl.acm.org
Artificial Intelligence (AI) is making a profound impact in almost every domain. A vital enabler
of its great success is the availability of abundant and high-quality data for building machine …

Toward causal representation learning

B Schölkopf, F Locatello, S Bauer, NR Ke… - Proceedings of the …, 2021 - ieeexplore.ieee.org
The two fields of machine learning and graphical causality arose and developed
separately. However, there is now cross-pollination and increasing interest in both fields to …

Measuring robustness to natural distribution shifts in image classification

R Taori, A Dave, V Shankar, N Carlini… - Advances in …, 2020 - proceedings.neurips.cc
We study how robust current ImageNet models are to distribution shifts arising from natural
variations in datasets. Most research on robustness focuses on synthetic image …

Self-training with Noisy Student improves ImageNet classification

Q Xie, MT Luong, E Hovy… - Proceedings of the IEEE …, 2020 - openaccess.thecvf.com
We present a simple self-training method that achieves 88.4% top-1 accuracy on ImageNet,
which is 2.0% better than the state-of-the-art model that requires 3.5B weakly labeled …

AugMax: Adversarial composition of random augmentations for robust training

H Wang, C Xiao, J Kossaifi, Z Yu… - Advances in neural …, 2021 - proceedings.neurips.cc
Data augmentation is a simple yet effective way to improve the robustness of deep neural
networks (DNNs). Diversity and hardness are two complementary dimensions of data …

Evaluating machine accuracy on ImageNet

V Shankar, R Roelofs, H Mania… - International …, 2020 - proceedings.mlr.press
We evaluate a wide range of ImageNet models with five trained human labelers. In our
year-long experiment, trained humans first annotated 40,000 images from the ImageNet and …

AdaMatch: A unified approach to semi-supervised learning and domain adaptation

D Berthelot, R Roelofs, K Sohn, N Carlini… - arXiv preprint arXiv …, 2021 - arxiv.org
We extend semi-supervised learning to the problem of domain adaptation to learn
significantly higher-accuracy models that train on one data distribution and test on a different …

Improving robustness without sacrificing accuracy with Patch Gaussian augmentation

RG Lopes, D Yin, B Poole, J Gilmer… - arXiv preprint arXiv …, 2019 - arxiv.org
Deploying machine learning systems in the real world requires both high accuracy on clean
data and robustness to naturally occurring corruptions. While architectural advances have …

The evolution of out-of-distribution robustness throughout fine-tuning

A Andreassen, Y Bahri, B Neyshabur… - arXiv preprint arXiv …, 2021 - arxiv.org
Although machine learning models typically experience a drop in performance on
out-of-distribution data, accuracies on in- versus out-of-distribution data are widely observed to …

CrossNorm and SelfNorm for generalization under distribution shifts

Z Tang, Y Gao, Y Zhu, Z Zhang, M Li… - Proceedings of the …, 2021 - openaccess.thecvf.com
Traditional normalization techniques (e.g., Batch Normalization and Instance Normalization)
generally and simplistically assume that training and test data follow the same distribution …