Preserving privacy in large language models: A survey on current threats and solutions

M Miranda, ES Ruzzetti, A Santilli, FM Zanzotto… - arXiv preprint arXiv …, 2024 - arxiv.org
Large Language Models (LLMs) represent a significant advancement in artificial
intelligence, finding applications across various domains. However, their reliance on …

Private distribution learning with public data: The view from sample compression

S Ben-David, A Bie, CL Canonne… - Advances in …, 2023 - proceedings.neurips.cc
We study the problem of private distribution learning with access to public data. In this setup,
which we refer to as *public-private learning*, the learner is given public and private …

Can Public Large Language Models Help Private Cross-device Federated Learning?

B Wang, YJ Zhang, Y Cao, B Li, HB McMahan… - arXiv preprint arXiv …, 2023 - arxiv.org
We study (differentially) private federated learning (FL) of language models. The language
models in cross-device FL are relatively small, which can be trained with meaningful formal …

Differentially private image classification by learning priors from random processes

X Tang, A Panda, V Sehwag… - Advances in Neural …, 2024 - proceedings.neurips.cc
In privacy-preserving machine learning, differentially private stochastic gradient descent
(DP-SGD) performs worse than SGD due to per-sample gradient clipping and noise addition. A …
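The snippet above names the two operations that distinguish DP-SGD from plain SGD: per-sample gradient clipping and Gaussian noise addition. A minimal sketch of one such update step, using NumPy (the function name and hyperparameters here are illustrative, not the paper's implementation):

```python
import numpy as np

def dp_sgd_step(params, per_sample_grads, lr=0.1, clip_norm=1.0, noise_mult=1.0):
    """One DP-SGD update: clip each per-sample gradient to L2 norm `clip_norm`,
    average the clipped gradients, then add calibrated Gaussian noise."""
    clipped = []
    for g in per_sample_grads:
        norm = np.linalg.norm(g)
        # Scale down any gradient whose norm exceeds the clipping bound.
        clipped.append(g * min(1.0, clip_norm / (norm + 1e-12)))
    batch = len(clipped)
    mean_grad = np.mean(clipped, axis=0)
    # Noise std is proportional to the clipping bound (the per-sample sensitivity).
    noise = np.random.normal(0.0, noise_mult * clip_norm / batch, size=params.shape)
    return params - lr * (mean_grad + noise)
```

Both operations bias and perturb the update relative to SGD, which is the accuracy gap this paper's random-process priors aim to close.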

ViP: A differentially private foundation model for computer vision

Y Yu, M Sanjabi, Y Ma, K Chaudhuri, C Guo - arXiv preprint arXiv …, 2023 - arxiv.org
Artificial intelligence (AI) has seen a tremendous surge in capabilities thanks to the use of
foundation models trained on internet-scale data. On the flip side, the uncurated nature of …

Differentially private synthetic data via foundation model APIs 1: Images

Z Lin, S Gopi, J Kulkarni, H Nori, S Yekhanin - arXiv preprint arXiv …, 2023 - arxiv.org
Generating differentially private (DP) synthetic data that closely resembles the original
private data is a scalable way to mitigate privacy concerns in the current data-driven world …

Private learning with public features

W Krichene, NE Mayoraz, S Rendle… - International …, 2024 - proceedings.mlr.press
We study a class of private learning problems in which the data is a join of private and public
features. This is often the case in private personalization tasks such as recommendation or …

Private fine-tuning of large language models with zeroth-order optimization

X Tang, A Panda, M Nasr, S Mahloujifar… - arXiv preprint arXiv …, 2024 - arxiv.org
Fine-tuning large pretrained models on private datasets may run the risk of violating privacy.
Differential privacy is a framework for mitigating privacy risks by enforcing algorithmic …
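The general technique named in this title, zeroth-order optimization, estimates gradients from loss values alone, with no backpropagation. A minimal sketch of a two-point estimate along a random direction (the loss function and parameters below are hypothetical, not the paper's setup):

```python
import numpy as np

def zo_grad_estimate(loss_fn, params, eps=1e-3, rng=None):
    """Two-point zeroth-order gradient estimate: perturb the parameters
    along a random direction z and project the loss difference onto z."""
    if rng is None:
        rng = np.random.default_rng()
    z = rng.standard_normal(params.shape)  # random perturbation direction
    # Finite-difference slope of the loss along z.
    delta = (loss_fn(params + eps * z) - loss_fn(params - eps * z)) / (2 * eps)
    return delta * z

# Usage on a simple quadratic loss, whose true gradient at w is 2*w.
loss = lambda w: float(np.sum(w ** 2))
w = np.array([1.0, -2.0])
g = zo_grad_estimate(loss, w, rng=np.random.default_rng(0))
```

Because only scalar loss values are queried, such estimators avoid storing per-sample gradients, which is part of their appeal for private fine-tuning of large models.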

Unlocking accuracy and fairness in differentially private image classification

L Berrada, S De, JH Shen, J Hayes, R Stanforth… - arXiv preprint arXiv …, 2023 - arxiv.org
Privacy-preserving machine learning aims to train models on private data without leaking
sensitive information. Differential privacy (DP) is considered the gold standard framework for …

Choosing public datasets for private machine learning via gradient subspace distance

X Gu, G Kamath, ZS Wu - arXiv preprint arXiv:2303.01256, 2023 - arxiv.org
Differentially private stochastic gradient descent privatizes model training by injecting noise
into each iteration, where the noise magnitude increases with the number of model …
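The claim that noise magnitude grows with the number of model parameters can be seen directly: with a fixed per-coordinate noise std sigma, the L2 norm of the full noise vector grows roughly like sigma times the square root of the dimension. A small numeric illustration (dimensions chosen arbitrarily):

```python
import numpy as np

# Per-coordinate Gaussian noise of fixed std sigma; its overall L2 norm
# grows roughly like sigma * sqrt(d) as the parameter count d increases.
rng = np.random.default_rng(0)
sigma = 1.0
norms = {d: float(np.linalg.norm(rng.normal(0.0, sigma, size=d)))
         for d in (10, 1_000, 100_000)}
```

This dimension dependence is what motivates choosing public data whose gradients span a low-dimensional subspace close to the private task's.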