Speech datasets are crucial for training Speech Language Technologies (SLT); however, the lack of diversity of the underlying training data can lead to serious limitations in building …
Content Warning: This paper contains examples of misgendering and erasure that could be offensive and potentially triggering. Gender bias in language technologies has been widely …
Machine learning (ML) systems in natural language processing (NLP) face significant challenges in generalizing to out-of-distribution (OOD) data, where the test …
Gender inequality is embedded in our communication practices and perpetuated in translation technologies. This becomes particularly apparent when translating into …
Natural language generation models reproduce and often amplify the biases present in their training data. Previous research explored using sequence-to-sequence rewriting models to …
As generic machine translation (MT) quality has improved, the need for targeted benchmarks that explore fine-grained aspects of quality has increased. In particular, gender …
J Vamvas, R Sennrich - 2021 Conference on Empirical Methods …, 2021 - research.ed.ac.uk
Lexical disambiguation is a major challenge for machine translation systems, especially if some senses of a word are trained less often than others. Identifying patterns of …
As part of the WMT-2023 "Test suites" shared task, in this paper we summarize the results of two test suite evaluations: MuST-SHE-WMT23 and INES. By focusing on the en-de and de …
Gender-neutral translation (GNT) that avoids biased and undue binary assumptions is a pivotal challenge for the creation of more inclusive translation technologies. Advancements …