SeeGULL Multilingual: a dataset of geo-culturally situated stereotypes

M Bhutani, K Robinson, V Prabhakaran, S Dave… - arXiv preprint arXiv …, 2024 - arxiv.org
While generative multilingual models are rapidly being deployed, their safety and fairness
evaluations are largely limited to resources collected in English. This is especially …

Current State-of-the-Art of Bias Detection and Mitigation in Machine Translation for African and European Languages: a Review

C Ikae, M Kurpicz-Briki - arXiv preprint arXiv:2410.21126, 2024 - arxiv.org
Studying bias detection and mitigation methods in natural language processing and the
particular case of machine translation is highly relevant, as societal stereotypes might be …

The GenderQueer test suite

SR Friðriksdóttir - Proceedings of the Ninth Conference on …, 2024 - aclanthology.org
This paper introduces the GenderQueer Test Suite, an evaluation set for assessing machine
translation (MT) systems' capabilities in handling gender-diverse and queer-inclusive …

Ordbog over moderne islandsk – udvikling og tilføjelser

EÞ Jóhannsson, Þ Úlfarsdóttir - LexicoNordica, 2024 - tidsskrift.dk
This article examines the lexicographic material found in Dictionary of Contemporary
Icelandic (OMI). We focus on four processes for acquiring new material: 1) corpus data, 2) …

WMT24 Test Suite: Gender Resolution in Speaker-Listener Dialogue Roles

H Dawkins, I Nejadgholi, C Lo - arXiv preprint arXiv:2411.06194, 2024 - arxiv.org
We assess the difficulty of gender resolution in literary-style dialogue settings and the
influence of gender stereotypes. Instances of the test suite contain spoken dialogue …

Representativeness and biases in Icelandic corpora

E Sigurðsson, S Steingrímsson - LexicoNordica, 2024 - tidsskrift.dk
All language data are inherently biased, as collection methods, availability of texts and
recordings, and the views of the collectors will always affect the process and its results. We …

Gendered Grammar or Ingrained Bias? Exploring Gender Bias in Icelandic Language Models

SR Friðriksdóttir, H Einarsson - Proceedings of the 2024 Joint …, 2024 - aclanthology.org
Large language models, trained on vast datasets, exhibit increased output quality in
proportion to the amount of data that is used to train them. This data-driven learning process …