Embroid: Unsupervised prediction smoothing can improve few-shot classification

N Guha, M Chen, K Bhatia… - Advances in Neural …, 2023 - proceedings.neurips.cc
Recent work has shown that language models' (LMs) prompt-based learning capabilities
make them well suited for automating data labeling in domains where manual annotation is …

Transferring annotator- and instance-dependent transition matrix for learning from crowds

S Li, X Xia, J Deng, S Gey, T Liu - IEEE Transactions on Pattern …, 2024 - ieeexplore.ieee.org
Learning from crowds describes that the annotations of training data are obtained with
crowd-sourcing services. Multiple annotators each complete their own small part of the …

Automatic calibration and error correction for large language models via Pareto optimal self-supervision

T Zhao, M Wei, JS Preston, H Poon - arXiv preprint arXiv:2306.16564, 2023 - arxiv.org
Large language models (LLMs) have demonstrated remarkable capabilities out of box for a
wide range of applications, yet accuracy still remains a major growth area, especially in …

Fusing conditional submodular GAN and programmatic weak supervision

K Shubham, P Sastry, AP Prathosh - … of the AAAI Conference on Artificial …, 2024 - ojs.aaai.org
Programmatic Weak Supervision (PWS) and generative models serve as crucial tools that
enable researchers to maximize the utility of existing datasets without resorting to laborious …

Ground truth inference for weakly supervised entity matching

R Wu, A Bendeck, X Chu, Y He - … of the ACM on Management of Data, 2023 - dl.acm.org
Entity matching (EM) refers to the problem of identifying pairs of data records in one or more
relational tables that refer to the same entity in the real world. Supervised machine learning …

Modelling variability in human annotator simulation

W Wu, W Chen, C Zhang… - Findings of the Association …, 2024 - aclanthology.org
Human annotator simulation (HAS) serves as a cost-effective substitute for human
evaluation tasks such as data annotation and system assessment. It is important to …

It HAS to be Subjective: Human Annotator Simulation via Zero-shot Density Estimation

W Wu, W Chen, C Zhang, PC Woodland - arXiv preprint arXiv:2310.00486, 2023 - arxiv.org
Human annotator simulation (HAS) serves as a cost-effective substitute for human
evaluation such as data annotation and system assessment. Human perception and …

Employing label models on ChatGPT answers improves legal text entailment performance

C Nguyen, LM Nguyen - arXiv preprint arXiv:2401.17897, 2024 - arxiv.org
The objective of legal text entailment is to ascertain whether the assertions in a legal query
logically follow from the information provided in one or multiple legal articles. ChatGPT, a …

Mitigating source bias for fairer weak supervision

C Shin, S Cromp, D Adila… - Advances in Neural …, 2024 - proceedings.neurips.cc
Weak supervision enables efficient development of training sets by reducing the need for
ground truth labels. However, the techniques that make weak supervision attractive, such …

How many validation labels do you need? Exploring the design space of label-efficient model ranking

Z Hu, J Zhang, Y Yu, Y Zhuang, H Xiong - arXiv preprint arXiv:2312.01619, 2023 - arxiv.org
The paper introduces LEMR, a framework that reduces annotation costs for model selection
tasks. Our approach leverages ensemble methods to generate pseudo-labels, employs …