Self-play fine-tuning converts weak language models to strong language models

Z Chen, Y Deng, H Yuan, K Ji, Q Gu - arXiv preprint arXiv:2401.01335, 2024 - arxiv.org
Harnessing the power of human-annotated data through Supervised Fine-Tuning (SFT) is
pivotal for advancing Large Language Models (LLMs). In this paper, we delve into the …

Self-training: A survey

MR Amini, V Feofanov, L Pauletto, L Hadjadj… - Neurocomputing, 2025 - Elsevier
Self-training methods have gained significant attention in recent years due to their
effectiveness in leveraging small labeled datasets and large unlabeled observations for …

Cycle self-training for domain adaptation

H Liu, J Wang, M Long - Advances in Neural Information …, 2021 - proceedings.neurips.cc
Mainstream approaches for unsupervised domain adaptation (UDA) learn domain-invariant
representations to narrow the domain shift, which are empirically effective but theoretically …

Learning with explanation constraints

R Pukdee, D Sam, JZ Kolter… - Advances in …, 2024 - proceedings.neurips.cc
As larger deep learning models are hard to interpret, there has been a recent focus on
generating explanations of these black-box models. In contrast, we may have a priori …

Can semi-supervised learning use all the data effectively? A lower bound perspective

A Tifrea, G Yüce, A Sanyal… - Advances in Neural …, 2024 - proceedings.neurips.cc
Prior theoretical and empirical works have established that semi-supervised learning
algorithms can leverage the unlabeled data to improve over the labeled sample complexity …

Knowledge distillation: Bad models can be good role models

G Kaplun, E Malach, P Nakkiran… - Advances in Neural …, 2022 - proceedings.neurips.cc
Large neural networks trained in the overparameterized regime are able to fit noise to zero
train error. Recent work of Nakkiran and Bansal has empirically observed that such networks …

How Does Semi-supervised Learning with Pseudo-labelers Work? A Case Study

Y Kou, Z Chen, Y Cao, Q Gu - International Conference on Learning …, 2023 - par.nsf.gov
Semi-supervised learning is a popular machine learning paradigm that utilizes a large
amount of unlabeled data as well as a small amount of labeled data to facilitate learning …

Theoretical Analysis of Weak-to-Strong Generalization

H Lang, D Sontag, A Vijayaraghavan - arXiv preprint arXiv:2405.16043, 2024 - arxiv.org
Strong student models can learn from weaker teachers: when trained on the predictions of a
weaker model, a strong pretrained student can learn to correct the weak model's errors and …

Generalization Guarantees of Self-Training of Halfspaces under Label Noise Corruption

L Hadjadj, MR Amini, S Louhichi - IJCAI, 2023 - ijcai.org
We investigate the generalization properties of a self-training algorithm with halfspaces. The
approach learns a list of halfspaces iteratively from labeled and unlabeled training data, in …

Multi-class Probabilistic Bounds for Majority Vote Classifiers with Partially Labeled Data

V Feofanov, E Devijver, MR Amini - Journal of Machine Learning Research, 2024 - jmlr.org
In this paper, we propose a probabilistic framework for analyzing a multi-class majority vote
classifier in the case where training data is partially labeled. First, we derive a multi-class …