What does it mean for a language model to preserve privacy?

H Brown, K Lee, F Mireshghallah, R Shokri… - Proceedings of the 2022 …, 2022 - dl.acm.org
Natural language reflects our private lives and identities, making its privacy concerns as
broad as those of real life. Language models lack the ability to understand the context and …

Flocks of stochastic parrots: Differentially private prompt learning for large language models

H Duan, A Dziedzic, N Papernot… - Advances in Neural …, 2024 - proceedings.neurips.cc
Large language models (LLMs) are excellent in-context learners. However, the sensitivity of
data contained in prompts raises privacy concerns. Our work first shows that these concerns …
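
The entry above concerns differentially private learning from prompt data. As a rough illustration of the kind of mechanism involved, the sketch below shows differentially private label aggregation via noisy vote counting (a standard PATE-style building block); the epsilon value, function name, and toy votes are illustrative assumptions, not details taken from the paper.

```python
import numpy as np

def noisy_argmax(votes, num_classes, epsilon=1.0, rng=None):
    """Return the majority label after adding Laplace noise to the vote
    histogram. Scale 2/epsilon is a conservative choice: changing one
    voter's label moves two counts by 1, so the L1 sensitivity is 2."""
    rng = rng or np.random.default_rng()
    counts = np.bincount(votes, minlength=num_classes).astype(float)
    counts += rng.laplace(loc=0.0, scale=2.0 / epsilon, size=num_classes)
    return int(np.argmax(counts))

# Toy usage: ten prompted "teacher" models vote on a binary label.
votes = [1, 1, 0, 1, 1, 1, 0, 1, 1, 1]
print(noisy_argmax(votes, num_classes=2, epsilon=2.0))
```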

Detecting pretraining data from large language models

W Shi, A Ajith, M Xia, Y Huang, D Liu, T Blevins… - arXiv preprint arXiv …, 2023 - arxiv.org
Although large language models (LLMs) are widely deployed, the data used to train them is
rarely disclosed. Given the incredible scale of this data, up to trillions of tokens, it is all but …
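
Detecting whether a passage appeared in a model's pretraining data is often framed as a scoring problem over token likelihoods. The sketch below is a minimal token-likelihood heuristic of that kind, assuming access to per-token log-probabilities from the model; the function names, k fraction, and threshold are illustrative assumptions rather than the paper's exact recipe.

```python
def min_k_score(token_logprobs, k=0.2):
    """Average log-probability of the lowest-k fraction of tokens.
    Memorized text tends to contain few surprising tokens, so this
    score is higher for text the model has likely seen."""
    if not token_logprobs:
        raise ValueError("need at least one token log-probability")
    n = max(1, int(len(token_logprobs) * k))
    lowest = sorted(token_logprobs)[:n]
    return sum(lowest) / n

def likely_in_pretraining(token_logprobs, threshold=-4.0, k=0.2):
    """Flag the text as 'seen' if even its least likely tokens are probable."""
    return min_k_score(token_logprobs, k) > threshold

# Toy usage: log-probabilities would normally come from the target model.
print(likely_in_pretraining([-0.5, -1.2, -0.8, -2.0, -0.3]))  # True here
print(likely_in_pretraining([-0.5, -7.5, -0.8, -9.0, -6.2]))  # False here
```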

Quantifying privacy risks of masked language models using membership inference attacks

F Mireshghallah, K Goyal, A Uniyal… - arXiv preprint arXiv …, 2022 - arxiv.org
The wide adoption and application of Masked language models (MLMs) on sensitive data
(from legal to medical) necessitates a thorough quantitative investigation into their privacy …
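
Membership inference attacks of this kind typically compare how well the target model fits a candidate record against a calibration signal, such as a reference model trained on similar public data. The sketch below is a generic likelihood-ratio test in that spirit; the threshold and function names are illustrative assumptions, and the log-likelihoods would in practice be computed by scoring the record under both models.

```python
def membership_score(target_loglik, reference_loglik):
    """Higher when the target model fits the record unusually well
    relative to the reference model, which suggests memorization."""
    return target_loglik - reference_loglik

def predict_member(target_loglik, reference_loglik, threshold=0.0):
    """Predict 'training member' if the likelihood ratio exceeds a
    threshold calibrated on known non-member records."""
    return membership_score(target_loglik, reference_loglik) > threshold

# Toy usage with made-up sequence log-likelihoods for one record.
print(predict_member(target_loglik=-42.0, reference_loglik=-55.0))  # True
print(predict_member(target_loglik=-60.0, reference_loglik=-58.0))  # False
```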

On the privacy risk of in-context learning

H Duan, A Dziedzic, M Yaghini… - The 61st Annual …, 2023 - adam-dziedzic.com
Large language models (LLMs) are excellent few-shot learners. They can perform a wide
variety of tasks purely based on natural language prompts provided to them. These prompts …

Security and privacy challenges of large language models: A survey

BC Das, MH Amini, Y Wu - arXiv preprint arXiv:2402.00888, 2024 - arxiv.org
Large Language Models (LLMs) have demonstrated extraordinary capabilities and
contributed to multiple fields, such as generating and summarizing text, language …

Identifying and mitigating privacy risks stemming from language models: A survey

V Smith, AS Shamsabadi, C Ashurst… - arXiv preprint arXiv …, 2023 - arxiv.org
Rapid advancements in language models (LMs) have led to their adoption across many
sectors. Alongside the potential benefits, such models present a range of risks, including …

Subject membership inference attacks in federated learning

A Suri, P Kanani, VJ Marathe, DW Peterson - arXiv preprint arXiv …, 2022 - arxiv.org
Privacy attacks on Machine Learning (ML) models often focus on inferring the existence of
particular data points in the training data. However, what the adversary really wants to know …

Do not give away my secrets: Uncovering the privacy issue of neural code completion tools

Y Huang, Y Li, W Wu, J Zhang, MR Lyu - arXiv preprint arXiv:2309.07639, 2023 - arxiv.org
Neural Code Completion Tools (NCCTs) have reshaped the field of software development;
they accurately suggest contextually relevant code snippets, benefiting from language …

Did the neurons read your book? Document-level membership inference for large language models

M Meeus, S Jain, M Rei, YA de Montjoye - arXiv preprint arXiv:2310.15007, 2023 - arxiv.org
With large language models (LLMs) poised to become embedded in our daily lives,
questions are starting to be raised about the dataset(s) they learned from. These questions …