Although large language models (LLMs) are widely deployed, the data used to train them is rarely disclosed. Given the sheer scale of this data, up to trillions of tokens, it is all but …
Retrieval-based language models (LMs) have demonstrated improved interpretability, factuality, and adaptability compared to their parametric counterparts, by incorporating …
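The retrieval step such systems rely on can be illustrated with a minimal sketch: embed the query and each passage (here with bag-of-words counts as a stand-in for a learned encoder), pick the most similar passage by cosine similarity, and prepend it to the prompt. The toy corpus and names are illustrative, not from any specific retrieval LM.

```python
import numpy as np

# Toy corpus standing in for a retrieval datastore.
corpus = [
    "the eiffel tower is in paris",
    "the great wall is in china",
    "mount fuji is in japan",
]

def embed(text, vocab):
    # Bag-of-words count vector; real systems use learned dense encoders.
    return np.array([text.split().count(w) for w in vocab], dtype=float)

vocab = sorted({w for doc in corpus for w in doc.split()})
doc_vecs = np.stack([embed(d, vocab) for d in corpus])

def retrieve(query):
    # Cosine similarity against every passage; return the best match.
    q = embed(query, vocab)
    sims = doc_vecs @ q / (
        np.linalg.norm(doc_vecs, axis=1) * (np.linalg.norm(q) + 1e-12)
    )
    return corpus[int(np.argmax(sims))]

# Retrieved evidence is prepended to the query before generation.
prompt = retrieve("where is the eiffel tower") + "\n\nQ: where is the eiffel tower?"
```

Because the retrieved passage is explicit text rather than a parametric memory, the model's evidence can be inspected and the datastore swapped without retraining, which is the source of the interpretability and adaptability gains.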
Cloud-based machine learning inference is an emerging paradigm in which users send their data to a service provider, who runs an ML model on that data and …
With the advancement of language models (LMs), their exposure to private data is increasingly inevitable, and their deployment (especially of smaller models) on personal …
W Krichene, NE Mayoraz, S Rendle… - International …, 2024 - proceedings.mlr.press
We study a class of private learning problems in which the data is a join of private and public features. This is often the case in private personalization tasks such as recommendation or …
We consider the problem of training private recommendation models with access to public item features. Training with Differential Privacy (DP) offers strong privacy guarantees, at the …
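The standard recipe behind such DP training is DP-SGD: clip each per-example gradient to a fixed norm, sum, add Gaussian noise scaled by a noise multiplier, and average. The sketch below shows one such step for a linear model with squared loss; it is a generic DP-SGD illustration, not the specific recommendation method of this paper, and the hyperparameters are arbitrary.

```python
import numpy as np

def dp_sgd_step(w, X, y, lr=0.1, clip_norm=1.0, noise_mult=1.0, rng=None):
    """One DP-SGD step: per-example clipping, Gaussian noise, then average."""
    rng = np.random.default_rng(0) if rng is None else rng
    n = len(y)
    # Per-example gradients of squared loss for a linear model, shape (n, d).
    grads = 2 * (X @ w - y)[:, None] * X
    # Clip each row to norm <= clip_norm, bounding any one example's influence.
    norms = np.linalg.norm(grads, axis=1, keepdims=True)
    grads = grads * np.minimum(1.0, clip_norm / np.maximum(norms, 1e-12))
    # Gaussian noise calibrated to the clipping norm (sigma = noise_mult).
    noise = rng.normal(0.0, noise_mult * clip_norm, size=w.shape)
    g = (grads.sum(axis=0) + noise) / n
    return w - lr * g

# Toy regression: recover a known weight vector under DP noise.
rng = np.random.default_rng(42)
X = rng.normal(size=(256, 3))
w_true = np.array([1.0, -2.0, 0.5])
y = X @ w_true + 0.01 * rng.normal(size=256)
w = np.zeros(3)
for _ in range(500):
    w = dp_sgd_step(w, X, y, rng=rng)
```

The clipping bounds the sensitivity of each update to any single example, which is what lets the added Gaussian noise translate into a formal DP guarantee via a privacy accountant (omitted here).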
We propose a novel and practical privacy notion called $f$-Membership Inference Privacy ($f$-MIP), which explicitly considers the capabilities of realistic adversaries under the …
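The adversary that membership-inference notions reason about can be made concrete with the classic loss-threshold baseline attack: guess that a point was in the training set if the model's loss on it falls below a threshold. The sketch below runs this attack against a deliberately memorizing toy "model" (loss = distance to the nearest training point); it illustrates the attack family only, not the $f$-MIP analysis itself.

```python
import numpy as np

rng = np.random.default_rng(0)
train = rng.normal(size=(50, 2))   # points the "model" was trained on
test = rng.normal(size=(50, 2))    # fresh points the model never saw

def model_loss(x, train):
    # A memorizing model: loss is the distance to the nearest training point,
    # so training members get exactly zero loss.
    return np.min(np.linalg.norm(train - x, axis=1))

threshold = 0.05

def is_member(x):
    # The attack: flag any point whose loss is suspiciously low.
    return model_loss(x, train) < threshold

# True-positive rate on members vs. false-positive rate on non-members.
tpr = np.mean([is_member(x) for x in train])
fpr = np.mean([is_member(x) for x in test])
```

The gap between the attack's true-positive and false-positive rates is exactly the quantity that trade-off-curve notions of membership privacy bound for every threshold at once.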
Quantum statistical queries provide a theoretical framework for investigating the computational power of a learner with limited quantum resources. This model is particularly …
Generative models trained with Differential Privacy (DP) are increasingly used to produce synthetic data while reducing privacy risks. Navigating their specific privacy-utility tradeoffs …