Large language models (LLMs) are capable of successfully performing many language processing tasks zero-shot (without training data). If zero-shot LLMs can also reliably classify …
D Card, S Chang, C Becker… - Proceedings of the …, 2022 - National Acad Sciences
We classify and analyze 200,000 US congressional speeches and 5,000 presidential communications related to immigration from 1880 to the present. Despite the salience of …
Labelled data is the foundation of most natural language processing tasks. However, labelling data is difficult and there often are diverse valid beliefs about what the correct data …
A common practice in building NLP datasets, especially using crowd-sourced annotations, involves obtaining multiple annotator judgements on the same data instances, which are …
Large Language Models have made remarkable progress in question-answering tasks, but challenges like hallucination and outdated information persist. These issues are …
Diversity in datasets is a key component to building responsible AI/ML. Despite this recognition, we know little about the diversity among the annotators involved in data …
S Do, É Ollion, R Shen - Sociological Methods & Research, 2024 - journals.sagepub.com
The last decade witnessed a spectacular rise in the volume of available textual data. With this new abundance came the question of how to analyze it. In the social sciences, scholars …
Annotators' sociodemographic backgrounds (i.e., the individual compositions of their gender, age, educational background, etc.) have a strong impact on their decisions when working on …
Climate change has become one of the biggest challenges of our time. Social media platforms such as Twitter play an important role in raising public awareness and spreading …