Power to the people? Opportunities and challenges for participatory AI

A Birhane, W Isaac, V Prabhakaran, M Diaz… - Proceedings of the 2nd …, 2022 - dl.acm.org
Participatory approaches to artificial intelligence (AI) and machine learning (ML) are gaining
momentum: the increased attention comes partly with the view that participation opens the …

Repairing the cracked foundation: A survey of obstacles in evaluation practices for generated text

S Gehrmann, E Clark, T Sellam - Journal of Artificial Intelligence Research, 2023 - jair.org
Evaluation practices in natural language generation (NLG) have many known flaws,
but improved evaluation approaches are rarely widely adopted. This issue has become …

Scaling autoregressive models for content-rich text-to-image generation

J Yu, Y Xu, JY Koh, T Luong, G Baid, Z Wang… - arXiv preprint arXiv …, 2022 - 3dvar.com
We present the Pathways [1] Autoregressive Text-to-Image (Parti) model, which
generates high-fidelity photorealistic images and supports content-rich synthesis involving …

Taxonomy of risks posed by language models

L Weidinger, J Uesato, M Rauh, C Griffin… - Proceedings of the …, 2022 - dl.acm.org
Responsible innovation on large-scale Language Models (LMs) requires foresight into and
in-depth understanding of the risks these models may pose. This paper develops a …

Ethical and social risks of harm from language models

L Weidinger, J Mellor, M Rauh, C Griffin… - arXiv preprint arXiv …, 2021 - arxiv.org
This paper aims to help structure the risk landscape associated with large-scale Language
Models (LMs). In order to foster advances in responsible innovation, an in-depth …

Fake it till you make it: Learning transferable representations from synthetic ImageNet clones

MB Sarıyıldız, K Alahari, D Larlus… - Proceedings of the …, 2023 - openaccess.thecvf.com
Recent image generation models such as Stable Diffusion have exhibited an impressive
ability to generate fairly realistic images starting from a simple text prompt. Could such …

“Everyone wants to do the model work, not the data work”: Data Cascades in High-Stakes AI

N Sambasivan, S Kapania, H Highfill… - Proceedings of the …, 2021 - dl.acm.org
AI models are increasingly applied in high-stakes domains like health and conservation.
Data quality carries an elevated significance in high-stakes AI due to its heightened …

Documenting large webtext corpora: A case study on the Colossal Clean Crawled Corpus

J Dodge, M Sap, A Marasović, W Agnew… - arXiv preprint arXiv …, 2021 - arxiv.org
Large language models have led to remarkable progress on many NLP tasks, and
researchers are turning to ever-larger text corpora to train them. Some of the largest corpora …

Five sources of bias in natural language processing

D Hovy, S Prabhumoye - Language and Linguistics Compass, 2021 - Wiley Online Library
Recently, there has been an increased interest in demographically grounded bias in natural
language processing (NLP) applications. Much of the recent work has focused on describing …

Data cards: Purposeful and transparent dataset documentation for responsible AI

M Pushkarna, A Zaldivar, O Kjartansson - Proceedings of the 2022 ACM …, 2022 - dl.acm.org
As research and industry moves towards large-scale models capable of numerous
downstream tasks, the complexity of understanding multi-modal datasets that give nuance to …