Needle in a Haystack: An Analysis of High-Agreement Workers on MTurk for Summarization

L. Zhang, S. Mille, Y. Hou, D. Deutsch, E. Clark, Y. Liu, S. Mahamood, S. Gehrmann, M. Clinciu. arXiv preprint arXiv:2212.10397, 2022. arxiv.org
To prevent the costly and inefficient use of resources on low-quality annotations, we want a method for creating a pool of dependable annotators who can reliably complete difficult tasks, such as evaluating automatic summarization. We therefore investigate the recruitment of high-quality Amazon Mechanical Turk workers via a two-step pipeline. We show that we can successfully filter out subpar workers before they carry out the evaluations and obtain high-agreement annotations under similar resource constraints. Although our workers demonstrate strong consensus among themselves and with CloudResearch workers, their alignment with expert judgments on a subset of the data falls short of expectations, indicating that further training on correctness is needed. Nonetheless, this paper serves as a best-practice guide for recruiting qualified annotators for other challenging annotation tasks.