A survey on programmatic weak supervision

J Zhang, CY Hsieh, Y Yu, C Zhang, A Ratner - arXiv preprint arXiv …, 2022 - arxiv.org
Labeling training data has become one of the major roadblocks to using machine learning.
Among various weak supervision paradigms, programmatic weak supervision (PWS) has …

Nemo: Guiding and contextualizing weak supervision for interactive data programming

CY Hsieh, J Zhang, A Ratner - arXiv preprint arXiv:2203.01382, 2022 - arxiv.org
Weak Supervision (WS) techniques allow users to efficiently create large training datasets
by programmatically labeling data with heuristic sources of supervision. While the success of …

Towards learned metadata extraction for data lakes

S Langenecker, C Sturm, C Schalles, C Binnig - 2021 - dl.gi.de
An important task for enabling the efficient exploration of available data in a data lake is to
annotate semantic type information to the available data sources. In order to reduce the …

Witan: unsupervised labelling function generation for assisted data programming

B Denham, EMK Lai, R Sinha, MA Naeem - Proceedings of the VLDB …, 2022 - dl.acm.org
Effective supervised training of modern machine learning models often requires large
labelled training datasets, which could be prohibitively costly to acquire for many practical …

Steered Training Data Generation for Learned Semantic Type Detection

S Langenecker, C Sturm, CS Schalles… - Proceedings of the ACM …, 2023 - dl.acm.org
In this paper, we introduce STEER to adapt learned semantic type extraction approaches to
a new, unseen data lake. STEER provides a data programming framework for semantic …

A Weakly Supervised Data Labeling Framework for Machine Lexical Normalization in Vietnamese Social Media

DH Nguyen, ATH Nguyen, K Van Nguyen - Cognitive Computation, 2025 - Springer
This study introduces an innovative automatic labeling framework to address the challenges
of lexical normalization in social media texts for low-resource languages like Vietnamese …

A Study on Reducing Big Data Image Annotation Burden Through Iterative Expert-In-The-Loop Strategy

E Mahmoodi, Z Xue, S Rajaraman… - 2023 IEEE International …, 2023 - ieeexplore.ieee.org
A key challenge in the development of reliable and robust medical imaging machine learning
solutions is the lack of annotated data. This problem becomes particularly significant when …

Tools of trade of the next blue-collar job? antecedents, design features, and outcomes of interactive labeling systems

M Knaeble, M Nadj, L Germann, A Maedche - 2023 - core.ac.uk
Supervised machine learning is becoming increasingly popular, and so is the need for
annotated training data. Such data often needs to be manually labeled by human workers …

Interpretable and Effortless Techniques for Social Network Analysis

MF Aparicio - 2023 - digibug.ugr.es
Social Networking Sites (SNS) are the most important way of communication nowadays.
They have changed how we interact with our friends and family, and even how companies …

A Methodology to Quickly Perform Opinion Mining and Build Supervised Datasets Using Social Networks Mechanics

M Francisco, JL Castro - IEEE Transactions on Knowledge and …, 2023 - ieeexplore.ieee.org
Social Networking Sites (SNS) offer a full set of possibilities to perform opinion studies such
as polling or market analysis. Normally, artificial intelligence techniques are applied, and …