Advances, challenges and opportunities in creating data for trustworthy AI

W Liang, GA Tadesse, D Ho, L Fei-Fei… - Nature Machine …, 2022 - nature.com
As artificial intelligence (AI) transitions from research to deployment, creating the appropriate
datasets and data pipelines to develop and evaluate AI models is increasingly the biggest …

Fair ranking: a critical review, challenges, and future directions

GK Patro, L Porcaro, L Mitchell, Q Zhang… - Proceedings of the …, 2022 - dl.acm.org
Ranking, recommendation, and retrieval systems are widely used in online platforms and
other societal systems, including e-commerce, media-streaming, admissions, gig platforms …

Do datasets have politics? Disciplinary values in computer vision dataset development

MK Scheuerman, A Hanna, E Denton - … of the ACM on Human-Computer …, 2021 - dl.acm.org
Data is a crucial component of machine learning. The field is reliant on data to train, validate,
and test models. With increased technical capabilities, machine learning research has …

FACET: Fairness in computer vision evaluation benchmark

L Gustafson, C Rolland, N Ravi… - Proceedings of the …, 2023 - openaccess.thecvf.com
Computer vision models have known performance disparities across attributes such as
gender and skin tone. This means during tasks such as classification and detection, model …

Studying up machine learning data: Why talk about bias when we mean power?

M Miceli, J Posada, T Yang - Proceedings of the ACM on Human …, 2022 - dl.acm.org
Research in machine learning (ML) has argued that models trained on incomplete or biased
datasets can lead to discriminatory outputs. In this commentary, we propose moving the …

The data-production dispositif

M Miceli, J Posada - Proceedings of the ACM on Human-Computer …, 2022 - dl.acm.org
Machine learning (ML) depends on data to train and verify models. Very often, organizations
outsource processes related to data work (i.e., generating and annotating data and …

A survey on bias in visual datasets

S Fabbrizzi, S Papadopoulos, E Ntoutsi… - Computer Vision and …, 2022 - Elsevier
Computer Vision (CV) has achieved remarkable results, outperforming humans in several
tasks. Nonetheless, it may result in significant discrimination if not handled properly. Indeed …

A hunt for the snark: Annotator diversity in data practices

S Kapania, AS Taylor, D Wang - … of the 2023 CHI Conference on Human …, 2023 - dl.acm.org
Diversity in datasets is a key component to building responsible AI/ML. Despite this
recognition, we know little about the diversity among the annotators involved in data …

On responsible machine learning datasets emphasizing fairness, privacy and regulatory norms with examples in biometrics and healthcare

S Mittal, K Thakral, R Singh, M Vatsa, T Glaser… - Nature Machine …, 2024 - nature.com
Artificial Intelligence (AI) has seamlessly integrated into numerous scientific domains,
catalysing unparalleled enhancements across a broad spectrum of tasks; however, its …

Understanding machine learning practitioners' data documentation perceptions, needs, challenges, and desiderata

AK Heger, LB Marquis, M Vorvoreanu… - Proceedings of the …, 2022 - dl.acm.org
Data is central to the development and evaluation of machine learning (ML) models.
However, the use of problematic or inappropriate datasets can result in harms when the …