Dealing with disagreements: Looking beyond the majority vote in subjective annotations

AM Davani, M Díaz, V Prabhakaran - Transactions of the Association …, 2022 - direct.mit.edu
Majority voting and averaging are common approaches used to resolve annotator
disagreements and derive single ground truth labels from multiple annotations. However …

Can large language models transform computational social science?

C Ziems, W Held, O Shaikh, J Chen, Z Zhang… - Computational …, 2024 - direct.mit.edu
Large language models (LLMs) are capable of successfully performing many language
processing tasks zero-shot (without training data). If zero-shot LLMs can also reliably classify …

Computational analysis of 140 years of US political speeches reveals more positive but increasingly polarized framing of immigration

D Card, S Chang, C Becker… - Proceedings of the …, 2022 - National Acad Sciences
We classify and analyze 200,000 US congressional speeches and 5,000 presidential
communications related to immigration from 1880 to the present. Despite the salience of …

Two contrasting data annotation paradigms for subjective NLP tasks

P Röttger, B Vidgen, D Hovy… - arXiv preprint arXiv …, 2021 - arxiv.org
Labelled data is the foundation of most natural language processing tasks. However,
labelling data is difficult and there often are diverse valid beliefs about what the correct data …

On releasing annotator-level labels and information in datasets

V Prabhakaran, AM Davani, M Díaz - arXiv preprint arXiv:2110.05699, 2021 - arxiv.org
A common practice in building NLP datasets, especially using crowd-sourced annotations,
involves obtaining multiple annotator judgements on the same data instances, which are …

ChatClimate: Grounding conversational AI in climate science

SA Vaghefi, D Stammbach, V Muccione… - … Earth & Environment, 2023 - nature.com
Large Language Models have made remarkable progress in question-answering
tasks, but challenges like hallucination and outdated information persist. These issues are …

A hunt for the snark: Annotator diversity in data practices

S Kapania, AS Taylor, D Wang - … of the 2023 CHI Conference on Human …, 2023 - dl.acm.org
Diversity in datasets is a key component to building responsible AI/ML. Despite this
recognition, we know little about the diversity among the annotators involved in data …

The augmented social scientist: Using sequential transfer learning to annotate millions of texts with human-level accuracy

S Do, É Ollion, R Shen - Sociological Methods & Research, 2024 - journals.sagepub.com
The last decade witnessed a spectacular rise in the volume of available textual data. With
this new abundance came the question of how to analyze it. In the social sciences, scholars …

How (not) to use sociodemographic information for subjective NLP tasks

T Beck, H Schuff, A Lauscher, I Gurevych - arXiv preprint arXiv:2309.07034, 2023 - arxiv.org
Annotators' sociodemographic backgrounds (i.e., the individual compositions of their gender,
age, educational background, etc.) have a strong impact on their decisions when working on …

A multi-task model for sentiment aided stance detection of climate change tweets

A Upadhyaya, M Fisichella, W Nejdl - Proceedings of the international …, 2023 - ojs.aaai.org
Climate change has become one of the biggest challenges of our time. Social media
platforms such as Twitter play an important role in raising public awareness and spreading …