Why is public pretraining necessary for private model training?

A Ganesh, M Haghifam, M Nasr, S Oh… - International …, 2023 - proceedings.mlr.press
In the privacy-utility tradeoff of a model trained on benchmark language and vision tasks,
remarkable improvements have been widely reported when the model is pretrained on …

Automatic clipping: Differentially private deep learning made easier and stronger

Z Bu, YX Wang, S Zha… - Advances in Neural …, 2024 - proceedings.neurips.cc
Per-example gradient clipping is a key algorithmic step that enables practical differentially
private (DP) training for deep learning models. The choice of clipping threshold $R$ …
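
The snippet cuts off mid-formula, but the mechanics it refers to are standard: vanilla DP-SGD rescales each per-example gradient to norm at most $R$, while the automatic-clipping idea replaces the hard threshold with a normalization so that $R$ no longer needs tuning. A minimal numpy sketch of both variants, assuming per-example gradients are already materialized as rows of a matrix; the function names and the stability constant gamma are illustrative, not taken from the paper.

```python
import numpy as np

def clip_per_example(grads, R):
    """Standard DP-SGD clipping: rescale each row to norm at most R."""
    norms = np.linalg.norm(grads, axis=1, keepdims=True)
    return grads * np.minimum(1.0, R / np.maximum(norms, 1e-12))

def auto_clip_per_example(grads, gamma=0.01):
    """Normalization-style 'automatic' clipping: g / (||g|| + gamma).
    Sketch of the idea: no clipping threshold to tune."""
    norms = np.linalg.norm(grads, axis=1, keepdims=True)
    return grads / (norms + gamma)

grads = np.random.randn(8, 5)            # 8 examples, 5 parameters
clipped = clip_per_example(grads, R=1.0)
assert np.all(np.linalg.norm(clipped, axis=1) <= 1.0 + 1e-9)
```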

Training data extraction from pre-trained language models: A survey

S Ishihara - arXiv preprint arXiv:2305.16157, 2023 - arxiv.org
As the deployment of pre-trained language models (PLMs) expands, pressing security
concerns have arisen regarding the potential for malicious extraction of training data, posing …

Correlated noise provably beats independent noise for differentially private learning

CA Choquette-Choo, K Dvijotham, K Pillutla… - arXiv preprint arXiv …, 2023 - arxiv.org
Differentially private learning algorithms inject noise into the learning process. While the
most common private learning algorithm, DP-SGD, adds independent Gaussian noise in …
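
The contrast the abstract draws can be made concrete: DP-SGD perturbs each step with a fresh i.i.d. Gaussian draw, whereas correlated-noise mechanisms (the matrix-factorization / DP-FTRL family) inject noise that is anti-correlated across steps, so that its contribution to the running trajectory partially cancels. A toy numpy sketch of the two noise streams; the mixing matrix B below is an arbitrary illustration, not the paper's optimal factorization.

```python
import numpy as np

rng = np.random.default_rng(0)
T, d, sigma = 100, 5, 1.0

# Independent noise (DP-SGD): a fresh Gaussian draw at every step.
z = rng.normal(0.0, sigma, size=(T, d))
independent = z

# Correlated noise: mix the i.i.d. draws with a lower-triangular matrix B,
# so step t receives B[t, :t+1] @ z[:t+1]. Choosing B well (the paper's
# subject) keeps per-step noise comparable but shrinks its prefix sums.
B = np.tril(np.full((T, T), -1.0 / T)) + np.eye(T)   # illustrative choice
correlated = B @ z

print(np.linalg.norm(np.cumsum(independent, axis=0)[-1]))  # larger
print(np.linalg.norm(np.cumsum(correlated, axis=0)[-1]))   # smaller
```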

Differentially private image classification by learning priors from random processes

X Tang, A Panda, V Sehwag… - Advances in Neural …, 2024 - proceedings.neurips.cc
In privacy-preserving machine learning, differentially private stochastic gradient descent
(DP-SGD) performs worse than SGD due to per-sample gradient clipping and noise addition. A …
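
Both costs the snippet names appear as one line each in a DP-SGD update. A minimal sketch of a single step, assuming per-example gradients for a batch are available as rows of a matrix; in practice sigma and R would be set by a privacy accountant, which is omitted here.

```python
import numpy as np

def dp_sgd_step(params, per_example_grads, lr=0.1, R=1.0, sigma=1.0,
                rng=np.random.default_rng()):
    """One DP-SGD update: clip each example's gradient, sum, add noise."""
    norms = np.linalg.norm(per_example_grads, axis=1, keepdims=True)
    clipped = per_example_grads * np.minimum(1.0, R / np.maximum(norms, 1e-12))
    noisy_sum = clipped.sum(axis=0) + rng.normal(0.0, sigma * R,
                                                 size=params.shape)
    return params - lr * noisy_sum / len(per_example_grads)

params = np.zeros(5)
grads = np.random.randn(32, 5)           # batch of 32 per-example gradients
params = dp_sgd_step(params, grads)
```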

Privacy-preserving in-context learning for large language models

T Wu, A Panda, JT Wang, P Mittal - arXiv preprint arXiv:2305.01639, 2023 - arxiv.org
In-context learning (ICL) is an important capability of Large Language Models (LLMs),
enabling these models to dynamically adapt based on specific, in-context exemplars …

Differentially private synthetic data via foundation model APIs 1: Images

Z Lin, S Gopi, J Kulkarni, H Nori, S Yekhanin - arXiv preprint arXiv …, 2023 - arxiv.org
Generating differentially private (DP) synthetic data that closely resembles the original
private data is a scalable way to mitigate privacy concerns in the current data-driven world …

Private fine-tuning of large language models with zeroth-order optimization

X Tang, A Panda, M Nasr, S Mahloujifar… - arXiv preprint arXiv …, 2024 - arxiv.org
Fine-tuning large pretrained models on private datasets may run the risk of violating privacy.
Differential privacy is a framework for mitigating privacy risks by enforcing algorithmic …
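
The snippet is truncated, but the mechanism named in the title is standard: zeroth-order methods estimate a gradient from loss values alone via a two-point perturbation (SPSA/MeZO-style estimators), which pairs naturally with DP because only a scalar difference needs clipping and noising. A hedged sketch of the estimator itself; the privacy machinery is omitted.

```python
import numpy as np

def zo_gradient(loss_fn, theta, eps=1e-3, rng=np.random.default_rng()):
    """Two-point zeroth-order gradient estimate:
    g ~ (L(theta + eps*u) - L(theta - eps*u)) / (2*eps) * u,  u ~ N(0, I)."""
    u = rng.normal(size=theta.shape)
    scalar = (loss_fn(theta + eps * u) - loss_fn(theta - eps * u)) / (2 * eps)
    return scalar * u

loss = lambda th: np.sum(th ** 2)        # toy objective
theta = np.ones(5)
for _ in range(200):
    theta -= 0.05 * zo_gradient(loss, theta)
print(np.linalg.norm(theta))             # shrinks toward 0
```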

Unlocking accuracy and fairness in differentially private image classification

L Berrada, S De, JH Shen, J Hayes, R Stanforth… - arXiv preprint arXiv …, 2023 - arxiv.org
Privacy-preserving machine learning aims to train models on private data without leaking
sensitive information. Differential privacy (DP) is considered the gold standard framework for …

Optimal Unbiased Randomizers for Regression with Label Differential Privacy

A Badanidiyuru Varadaraja, B Ghazi… - Advances in …, 2024 - proceedings.neurips.cc
We propose a new family of label randomizers for training regression models under the
constraint of label differential privacy (DP). In particular, we leverage the trade-offs between …
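
Even truncated, the snippet identifies the object of study: a label randomizer perturbs only the label y, and for regression an unbiased randomizer keeps E[y_noisy] = y so that downstream least-squares training is not systematically skewed. A minimal sketch using zero-mean Laplace noise on bounded labels; this is the classical baseline in this setting, not the paper's optimal construction.

```python
import numpy as np

def laplace_label_randomizer(y, epsilon, y_range=1.0,
                             rng=np.random.default_rng()):
    """eps-label-DP randomizer for regression labels in [0, y_range].
    Laplace noise with scale y_range/epsilon is zero-mean, so the
    randomizer is unbiased, at the cost of variance 2*(y_range/epsilon)**2."""
    return y + rng.laplace(0.0, y_range / epsilon, size=np.shape(y))

y = np.random.rand(10_000)               # true labels in [0, 1]
y_priv = laplace_label_randomizer(y, epsilon=2.0)
print(y.mean(), y_priv.mean())           # means agree up to sampling noise
```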