ChroniclingAmericaQA: A large-scale question answering dataset based on historical American newspaper pages

B Piryani, J Mozafari, A Jatowt - … of the 47th International ACM SIGIR …, 2024 - dl.acm.org
Question answering (QA) and Machine Reading Comprehension (MRC) tasks have
significantly advanced in recent years due to the rapid development of deep learning …

Intersectional bias mitigation in pre-trained language models: A quantum-inspired approach

O Shokrollahi - Proceedings of the 32nd ACM International …, 2023 - dl.acm.org
The growing criticality of contextualized language models has raised concerns about the
perpetuation of biases. Current fairness research often concentrates on single aspects of …

Dataset Scale and Societal Consistency Mediate Facial Impression Bias in Vision-Language AI

R Wolfe, A Dangol, A Hiniker, B Howe - … of the AAAI/ACM Conference on …, 2024 - ojs.aaai.org
Multimodal AI models capable of associating images and text hold promise for numerous
domains, ranging from automated image captioning to accessibility applications for blind …

ML-EAT: A Multilevel Embedding Association Test for Interpretable and Transparent Social Science

R Wolfe, A Hiniker, B Howe - Proceedings of the AAAI/ACM Conference …, 2024 - ojs.aaai.org
This research introduces the Multilevel Embedding Association Test (ML-EAT), a method
designed for interpretable and transparent measurement of intrinsic bias in language …

PHD: Pixel-Based Language Modeling of Historical Documents

N Borenstein, P Rust, D Elliott, I Augenstein - arXiv preprint arXiv …, 2023 - arxiv.org
The digitisation of historical documents has provided historians with unprecedented
research opportunities. Yet, the conventional approach to analysing historical documents …

A Multilingual Perspective on Probing Gender Bias

K Stańczak - arXiv preprint arXiv:2403.10699, 2024 - arxiv.org
Gender bias represents a form of systematic negative treatment that targets individuals
based on their gender. This discrimination can range from subtle sexist remarks and …

Toxicity of the Commons: Curating Open-Source Pre-Training Data

C Arnett, E Jones, IP Yamshchikov… - arXiv preprint arXiv …, 2024 - researchgate.net
Open-source large language models are becoming increasingly available and popular
among researchers and practitioners. While significant progress has been made on open …

Fairness in transfer learning for natural language processing

S Goldfarb-Tarrant - 2024 - era.ed.ac.uk
Natural Language Processing (NLP) systems have come to permeate so many areas of
daily life that it is difficult to live a day without having one or many experiences mediated by …