Multimodal datasets: misogyny, pornography, and malignant stereotypes

A Birhane, VU Prabhu, E Kahembwe - arXiv preprint arXiv:2110.01963, 2021 - arxiv.org
We have now entered the era of trillion-parameter machine learning models trained on
billion-sized datasets scraped from the internet. The rise of these gargantuan datasets has …

Standardizing reporting of participant compensation in HCI: A systematic literature review and recommendations for the field

J Pater, A Coupe, R Pfafman, C Phelan… - Proceedings of the …, 2021 - dl.acm.org
The user study is a fundamental method used in HCI. In designing user studies, we often
use compensation strategies to incentivize recruitment. However, compensation can also …

Screen recognition: Creating accessibility metadata for mobile applications from pixels

X Zhang, L De Greef, A Swearngin, S White… - Proceedings of the …, 2021 - dl.acm.org
Many accessibility features available on mobile platforms require applications (apps) to
provide complete and accurate metadata describing user interface (UI) components …

Captioning images taken by people who are blind

D Gurari, Y Zhao, M Zhang, N Bhattacharya - Computer Vision–ECCV …, 2020 - Springer
While an important problem in the vision community is to design algorithms that can
automatically caption images, few publicly-available datasets for algorithm development …

“It's complicated”: Negotiating accessibility and (mis)representation in image descriptions of race, gender, and disability

CL Bennett, C Gleason, MK Scheuerman… - Proceedings of the …, 2021 - dl.acm.org
Content creators are instructed to write textual descriptions of visual content to make it
accessible; yet existing guidelines lack specifics on how to write about people's appearance …

" Person, Shoes, Tree. Is the Person Naked?" What People with Vision Impairments Want in Image Descriptions

A Stangl, MR Morris, D Gurari - Proceedings of the 2020 CHI Conference …, 2020 - dl.acm.org
Access to digital images is important to people who are blind or have low vision (BLV). Many
contemporary image description efforts do not take into account this population's nuanced …

Going beyond one-size-fits-all image descriptions to satisfy the information wants of people who are blind or have low vision

A Stangl, N Verma, KR Fleischmann… - Proceedings of the 23rd …, 2021 - dl.acm.org
Image descriptions are how people who are blind or have low vision (BLV) access
information depicted within images. To our knowledge, no prior work has examined how a …

ImageExplorer: Multi-layered touch exploration to encourage skepticism towards imperfect AI-generated image captions

J Lee, J Herskovitz, YH Peng, A Guo - … of the 2022 CHI Conference on …, 2022 - dl.acm.org
Blind users rely on alternative text (alt-text) to understand an image; however, alt-text is often
missing. AI-generated captions are a more scalable alternative, but they often miss crucial …

Context-VQA: Towards context-aware and purposeful visual question answering

N Naik, C Potts, E Kreiss - Proceedings of the IEEE/CVF …, 2023 - openaccess.thecvf.com
Visual question answering (VQA) has the potential to make the Internet more accessible in
an interactive way, allowing people who cannot see images to ask questions about them …

Widget captioning: Generating natural language description for mobile user interface elements

Y Li, G Li, L He, J Zheng, H Li, Z Guan - arXiv preprint arXiv:2010.04295, 2020 - arxiv.org
Natural language descriptions of user interface (UI) elements such as alternative text are
crucial for accessibility and language-based interaction in general. Yet, these descriptions …