Differentially private fine-tuning of language models

D Yu, S Naik, A Backurs, S Gopi, HA Inan… - arXiv preprint arXiv …, 2021 - arxiv.org
We give simpler, sparser, and faster algorithms for differentially private fine-tuning of large-
scale pre-trained language models, which achieve the state-of-the-art privacy versus utility …
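
The abstract points to algorithms that fine-tune only a small, sparse set of parameters under differential privacy. As a rough illustration of that general recipe (not the authors' method; the adapter rank, dimensions, and PyTorch layer below are my own assumptions), a low-rank adapter keeps the pretrained weights frozen so that only a few thousand parameters ever receive private gradient updates:

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """A frozen pretrained linear layer plus a trainable low-rank update.

    Only A and B would be trained (e.g., with DP-SGD); the base weights stay
    frozen, so the privately updated parameter set is small.
    """
    def __init__(self, base: nn.Linear, rank: int = 8, alpha: float = 16.0):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad = False                             # freeze pretrained weights
        d_out, d_in = base.weight.shape
        self.A = nn.Parameter(torch.randn(rank, d_in) * 0.01)   # down-projection
        self.B = nn.Parameter(torch.zeros(d_out, rank))         # up-projection, zero init
        self.scale = alpha / rank

    def forward(self, x):
        return self.base(x) + self.scale * (x @ self.A.T) @ self.B.T

# Wrap one layer of a toy model and count trainable parameters.
layer = LoRALinear(nn.Linear(768, 768), rank=8)
trainable = sum(p.numel() for p in layer.parameters() if p.requires_grad)
total = sum(p.numel() for p in layer.parameters())
print(f"trainable: {trainable} / {total}")  # roughly 12k of 600k parameters
```

Because only the adapter parameters receive gradients, per-example clipping and noise addition touch far fewer coordinates than full-model private fine-tuning.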

Why is public pretraining necessary for private model training?

A Ganesh, M Haghifam, M Nasr, S Oh… - International …, 2023 - proceedings.mlr.press
In the privacy-utility tradeoff of a model trained on benchmark language and vision tasks,
remarkable improvements have been widely reported when the model is pretrained on …

Reasoning about generalization via conditional mutual information

T Steinke, L Zakynthinou - Conference on Learning Theory, 2020 - proceedings.mlr.press
We provide an information-theoretic framework for studying the generalization properties of
machine learning algorithms. Our framework ties together existing approaches, including …
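
For orientation, the quantity the framework is built around can be sketched as follows (my paraphrase of the standard supersample construction; exact constants should be checked against the paper):

```latex
% Supersample of 2n i.i.d. draws and uniform selector bits:
\tilde{Z} = (Z_{i,0}, Z_{i,1})_{i=1}^{n} \sim \mathcal{D}^{2n}, \qquad
U = (U_1, \dots, U_n) \sim \mathrm{Uniform}(\{0,1\}^{n}).
% The training set keeps one point from each pair:
S_U = (Z_{1,U_1}, \dots, Z_{n,U_n}).
% Conditional mutual information of algorithm A: how much its output reveals
% about which half of the supersample was used, given the supersample itself:
\mathrm{CMI}_{\mathcal{D}}(A) = I\bigl(A(S_U); U \mid \tilde{Z}\bigr).
% For losses bounded in [0,1], the expected generalization gap is controlled,
% up to constants, by \sqrt{\mathrm{CMI}_{\mathcal{D}}(A)/n}.
```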

Do not let privacy overbill utility: Gradient embedding perturbation for private learning

D Yu, H Zhang, W Chen, TY Liu - arXiv preprint arXiv:2102.12677, 2021 - arxiv.org
The privacy leakage of a model about its training data can be bounded by the differential
privacy mechanism. However, for meaningful privacy parameters, a differentially private …
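
The technique named in the title belongs to a family of methods that add noise to gradients in a low-dimensional embedding rather than in the full parameter space. The numpy sketch below illustrates only that general idea, with a subspace estimated from auxiliary "anchor" gradients and the residual component simply dropped; it is not the paper's exact procedure, and all names and hyperparameters are placeholders:

```python
import numpy as np

rng = np.random.default_rng(0)

def low_dim_gradient_perturbation(per_example_grads, anchor_grads, k=16,
                                  clip_norm=1.0, noise_mult=1.0):
    """Illustrative low-dimensional gradient perturbation (not the paper's method).

    Project per-example gradients onto the top-k subspace spanned by auxiliary
    'anchor' gradients, clip and noise the low-dimensional embeddings, and map
    the aggregate back to parameter space. The residual outside the subspace,
    which such methods typically also handle, is dropped here for brevity.
    """
    # Top-k subspace from anchor gradients (k x d orthonormal rows).
    _, _, vt = np.linalg.svd(anchor_grads, full_matrices=False)
    basis = vt[:k]                                              # (k, d)

    emb = per_example_grads @ basis.T                           # (n, k) embeddings
    norms = np.linalg.norm(emb, axis=1, keepdims=True)
    emb = emb * np.minimum(1.0, clip_norm / (norms + 1e-12))    # per-example clipping

    noisy_sum = emb.sum(axis=0) + rng.normal(scale=noise_mult * clip_norm, size=k)
    return (noisy_sum / len(per_example_grads)) @ basis         # back to d dimensions

# Toy usage: 32 private examples, 64 anchor gradients, 1000-dimensional parameters.
g_private = rng.normal(size=(32, 1000))
g_anchor = rng.normal(size=(64, 1000))
print(low_dim_gradient_perturbation(g_private, g_anchor, k=16).shape)  # (1000,)
```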

Mixed differential privacy in computer vision

A Golatkar, A Achille, YX Wang… - Proceedings of the …, 2022 - openaccess.thecvf.com
We introduce AdaMix, an adaptive differentially private algorithm for training deep neural
network classifiers using both private and public image data. While pre-training language …

Bypassing the ambient dimension: Private sgd with gradient subspace identification

Y Zhou, ZS Wu, A Banerjee - arXiv preprint arXiv:2007.03813, 2020 - arxiv.org
Differentially private SGD (DP-SGD) is one of the most popular methods for solving
differentially private empirical risk minimization (ERM). Due to its noisy perturbation on each …
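
The "noisy perturbation" the snippet refers to is the standard DP-SGD step: clip each per-example gradient to a fixed norm and add Gaussian noise to the sum before averaging. A minimal numpy sketch, with placeholder batch size, clipping norm, and noise multiplier:

```python
import numpy as np

rng = np.random.default_rng(0)

def dp_sgd_step(params, per_example_grads, lr=0.1, clip_norm=1.0, noise_mult=1.0):
    """One DP-SGD update: per-example clipping plus Gaussian noise on the sum."""
    n = len(per_example_grads)
    norms = np.linalg.norm(per_example_grads, axis=1, keepdims=True)
    clipped = per_example_grads * np.minimum(1.0, clip_norm / (norms + 1e-12))
    noisy_sum = clipped.sum(axis=0) + rng.normal(
        scale=noise_mult * clip_norm, size=params.shape)
    return params - lr * noisy_sum / n

# Toy usage: a 10-dimensional parameter vector and a batch of 8 gradients.
theta = np.zeros(10)
grads = rng.normal(size=(8, 10))
theta = dp_sgd_step(theta, grads)
print(theta)
```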

Public data-assisted mirror descent for private model training

E Amid, A Ganesh, R Mathews… - International …, 2022 - proceedings.mlr.press
In this paper, we revisit the problem of using in-distribution public data to improve the
privacy/utility trade-offs for differentially private (DP) model training. (Here, public data refers …
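
As a deliberately simple illustration of this setting (a generic baseline, not the mirror-descent algorithm the paper studies; the mixing weight and other hyperparameters are arbitrary assumptions), a training step can average a clean gradient from public examples with a clipped, noised gradient from private examples:

```python
import numpy as np

rng = np.random.default_rng(1)

def mixed_public_private_step(params, private_grads, public_grad, lr=0.1,
                              clip_norm=1.0, noise_mult=1.0, public_weight=0.5):
    """Combine a noisy clipped private gradient with a clean public gradient.

    A generic baseline for public-data-assisted DP training; not the paper's
    mirror-descent construction.
    """
    n = len(private_grads)
    norms = np.linalg.norm(private_grads, axis=1, keepdims=True)
    clipped = private_grads * np.minimum(1.0, clip_norm / (norms + 1e-12))
    noisy_private = (clipped.sum(axis=0)
                     + rng.normal(scale=noise_mult * clip_norm, size=params.shape)) / n
    grad = public_weight * public_grad + (1.0 - public_weight) * noisy_private
    return params - lr * grad

# Toy usage on a 20-dimensional parameter vector.
theta = np.zeros(20)
theta = mixed_public_private_step(theta,
                                  private_grads=rng.normal(size=(16, 20)),
                                  public_grad=rng.normal(size=20))
print(theta[:5])
```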

Private distribution learning with public data: The view from sample compression

S Ben-David, A Bie, CL Canonne… - Advances in …, 2023 - proceedings.neurips.cc
We study the problem of private distribution learning with access to public data. In this setup,
which we refer to as *public-private learning*, the learner is given public and private …

Private estimation with public data

A Bie, G Kamath, V Singhal - Advances in neural …, 2022 - proceedings.neurips.cc
We initiate the study of differentially private (DP) estimation with access to a small amount of
public data. For private estimation of $d$-dimensional Gaussians, we assume that the …
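
One way to see how even a little public data can help in such estimation problems (an illustration of the general recipe only, not the paper's estimator or its guarantees): use a rough public-data estimate to center and clip the private samples, then add Gaussian noise calibrated to the clipping radius:

```python
import numpy as np

rng = np.random.default_rng(2)

def public_assisted_dp_mean(private_x, public_x, clip_radius, epsilon, delta):
    """Gaussian-mechanism mean estimate centered and clipped around a public estimate."""
    n, d = private_x.shape
    center = public_x.mean(axis=0)                       # rough public-data estimate
    diffs = private_x - center
    norms = np.linalg.norm(diffs, axis=1, keepdims=True)
    clipped = diffs * np.minimum(1.0, clip_radius / (norms + 1e-12))
    # L2 sensitivity of the clipped mean is 2 * clip_radius / n (replace-one neighbors).
    sigma = 2 * clip_radius / n * np.sqrt(2 * np.log(1.25 / delta)) / epsilon
    return center + clipped.mean(axis=0) + rng.normal(scale=sigma, size=d)

# Toy usage: 10 public and 1000 private samples from the same shifted Gaussian.
true_mean = np.full(5, 3.0)
public = rng.normal(loc=true_mean, size=(10, 5))
private = rng.normal(loc=true_mean, size=(1000, 5))
print(public_assisted_dp_mean(private, public, clip_radius=5.0, epsilon=1.0, delta=1e-5))
```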

Leveraging public data for practical private query release

T Liu, G Vietri, T Steinke, J Ullman… - … on Machine Learning, 2021 - proceedings.mlr.press
In many statistical problems, incorporating priors can significantly improve performance.
However, the use of prior knowledge in differentially private query release has remained …
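
A common way to encode such a prior is to seed an MWEM-style synthetic distribution with the public data's empirical histogram instead of the uniform distribution. The sketch below works over a small discrete domain and is illustrative only; it is not necessarily the algorithm proposed in the paper, and the domain size, query set, and budget split are assumptions:

```python
import numpy as np

rng = np.random.default_rng(3)

def public_prior_query_release(private_counts, public_hist, queries, rounds=10,
                               epsilon=1.0):
    """MWEM-style query release seeded with a public-data prior (illustrative sketch).

    private_counts: histogram of the private dataset over a small discrete domain
    public_hist:    normalized public-data histogram, used as the starting synthetic
                    distribution in place of the usual uniform initialization
    queries:        0/1 matrix with one row per counting query
    """
    n = private_counts.sum()
    true_answers = queries @ private_counts            # exact counts (kept private)
    synth = public_hist.copy()                         # synthetic distribution
    eps_round = epsilon / (2 * rounds)                 # budget: selection + measurement

    for _ in range(rounds):
        # Exponential mechanism: prefer queries the synthetic data answers badly
        # (counting queries have sensitivity 1).
        errors = np.abs(true_answers - n * (queries @ synth))
        scores = eps_round * (errors - errors.max()) / 2   # shift for numerical stability
        probs = np.exp(scores)
        probs /= probs.sum()
        q = rng.choice(len(queries), p=probs)

        # Laplace measurement of the chosen query, then a multiplicative-weights update.
        noisy = true_answers[q] + rng.laplace(scale=1 / eps_round)
        synth = synth * np.exp(queries[q] * (noisy / n - queries[q] @ synth) / 2)
        synth /= synth.sum()
    return synth

# Toy usage: domain of size 8, 500 private records, 20 random counting queries.
domain = 8
private_counts = rng.multinomial(500, rng.dirichlet(np.ones(domain))).astype(float)
public_hist = rng.dirichlet(np.ones(domain))   # stand-in for in-distribution public data
queries = rng.integers(0, 2, size=(20, domain)).astype(float)
print(np.round(public_prior_query_release(private_counts, public_hist, queries), 3))
```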