Rethinking machine unlearning for large language models

S Liu, Y Yao, J Jia, S Casper, N Baracaldo… - arXiv preprint arXiv …, 2024 - arxiv.org
We explore machine unlearning (MU) in the domain of large language models (LLMs),
referred to as LLM unlearning. This initiative aims to eliminate undesirable data influence …
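
A minimal sketch of the objective this line of work studies may help: push the model's loss up on a "forget" set while anchoring it to a "retain" set. The toy model, data, and the weight lam below are illustrative placeholders, not this paper's method.

```python
import torch
import torch.nn as nn

# Toy stand-ins for an LLM and its data; any nn.Module with a scalar loss works here.
model = nn.Linear(16, 4)
loss_fn = nn.CrossEntropyLoss()
opt = torch.optim.SGD(model.parameters(), lr=1e-2)

x_forget, y_forget = torch.randn(8, 16), torch.randint(0, 4, (8,))
x_retain, y_retain = torch.randn(8, 16), torch.randint(0, 4, (8,))
lam = 1.0  # retain-set regularization strength (illustrative)

for _ in range(100):
    opt.zero_grad()
    # Gradient ascent on the forget set (negated loss), the simplest unlearning signal ...
    forget_term = -loss_fn(model(x_forget), y_forget)
    # ... balanced against ordinary descent on the retain set to preserve utility.
    retain_term = loss_fn(model(x_retain), y_retain)
    (forget_term + lam * retain_term).backward()
    opt.step()
```

Most of the methods below can be read as refinements of this forget-versus-retain trade-off.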

SOUL: Unlocking the power of second-order optimization for LLM unlearning

J Jia, Y Zhang, Y Zhang, J Liu, B Runwal… - arXiv preprint arXiv …, 2024 - arxiv.org
Large Language Models (LLMs) have highlighted the necessity of effective unlearning
mechanisms to comply with data regulations and ethical AI practices. LLM unlearning aims …
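
Second-order unlearning replaces a plain gradient step with a curvature-aware one. The sketch below shows a clipped diagonal-Newton update in the style of the Sophia optimizer, with a Hutchinson estimate of the Hessian diagonal on a toy quadratic; the constants and the quadratic loss are assumptions for illustration, not SOUL's training setup.

```python
import torch

torch.manual_seed(0)
A = torch.randn(8, 8); A = A @ A.T + 8 * torch.eye(8)  # SPD Hessian of the toy loss
theta = torch.randn(8, requires_grad=True)
lr, rho, eps = 0.1, 1.0, 1e-6

for _ in range(50):
    loss = 0.5 * theta @ A @ theta
    (g,) = torch.autograd.grad(loss, theta, create_graph=True)
    # Hutchinson estimate of the Hessian diagonal: E[u * (H u)], u ~ Rademacher.
    u = torch.randint(0, 2, theta.shape).float() * 2 - 1
    (hvp,) = torch.autograd.grad(g @ u, theta)
    h = (u * hvp).clamp(min=eps)
    with torch.no_grad():
        # Clipped, diagonally preconditioned step, Sophia-style.
        theta -= lr * (g / h).clamp(-rho, rho)
```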

Scissorhands: Scrub data influence via connection sensitivity in networks

J Wu, M Harandi - European Conference on Computer Vision, 2025 - Springer
Machine unlearning has become a pivotal task to erase the influence of data from a
trained model. It adheres to recent data regulation standards and enhances the privacy and …
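
Connection sensitivity is commonly scored as |theta * dL/dtheta| (as in SNIP). Below is a hedged sketch of scoring parameters on a forget batch and re-initializing the most sensitive ones, loosely in the spirit of the title; the 10% threshold and reset rule are illustrative, and a repair phase on retain data would normally follow.

```python
import torch
import torch.nn as nn

model = nn.Linear(16, 4)                      # toy stand-in for a trained network
x_f, y_f = torch.randn(8, 16), torch.randint(0, 4, (8,))

loss = nn.functional.cross_entropy(model(x_f), y_f)
loss.backward()

with torch.no_grad():
    for p in model.parameters():
        # SNIP-style connection sensitivity: |theta * dL/dtheta| on forget data.
        score = (p * p.grad).abs()
        k = max(1, int(0.1 * p.numel()))      # reset the top 10% (illustrative)
        thresh = score.flatten().topk(k).values.min()
        mask = score >= thresh
        p[mask] = torch.randn_like(p)[mask] * 0.01
```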

CURE4Rec: A benchmark for recommendation unlearning with deeper influence

C Chen, J Zhang, Y Zhang, L Zhang, L Lyu, Y Li… - arXiv preprint arXiv …, 2024 - arxiv.org
With increasing privacy concerns in artificial intelligence, regulations have mandated the
right to be forgotten, granting individuals the right to withdraw their data from models …

WAGLE: Strategic Weight Attribution for Effective and Modular Unlearning in Large Language Models

J Jia, J Liu, Y Zhang, P Ram, N Baracaldo… - arXiv preprint arXiv …, 2024 - arxiv.org
The need for effective unlearning mechanisms in large language models (LLMs) is
increasingly urgent, driven by the necessity to adhere to data regulations and foster ethical …
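
Weight attribution suggests scoring which weights matter for the forget objective and restricting unlearning updates to that subset. Below is a sketch of a gradient-magnitude mask gating a gradient-ascent step; the scoring rule and the 20% threshold are assumptions for illustration, not WAGLE's attribution scheme.

```python
import torch
import torch.nn as nn

model = nn.Linear(16, 4)
x_f, y_f = torch.randn(8, 16), torch.randint(0, 4, (8,))

# Attribution pass: score weights by gradient magnitude on the forget loss.
nn.functional.cross_entropy(model(x_f), y_f).backward()
masks = {}
for name, p in model.named_parameters():
    score = p.grad.abs()
    masks[name] = (score >= score.flatten().quantile(0.8)).float()  # top 20%
    p.grad = None

# Unlearning pass: gradient ascent on the forget loss, applied only through the mask.
opt = torch.optim.SGD(model.parameters(), lr=1e-2)
for _ in range(20):
    opt.zero_grad()
    (-nn.functional.cross_entropy(model(x_f), y_f)).backward()
    for name, p in model.named_parameters():
        p.grad *= masks[name]                 # modular update: unmasked weights stay frozen
    opt.step()
```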

Simplicity Prevails: Rethinking Negative Preference Optimization for LLM Unlearning

C Fan, J Liu, L Lin, J Jia, R Zhang, S Mei… - arXiv preprint arXiv …, 2024 - arxiv.org
In this work, we address the problem of large language model (LLM) unlearning, aiming to
remove unwanted data influences and associated model capabilities (e.g., copyrighted data …
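
Negative preference optimization (NPO) treats forget samples as dispreferred responses, with loss L = -(2/β) E[log σ(-β log(π_θ(y|x)/π_ref(y|x)))]; the reference-free simplification studied here replaces the reference ratio with a length-normalized log-probability plus a margin γ. A sketch of both losses from precomputed sequence log-probs; the β and γ values are illustrative.

```python
import torch
import torch.nn.functional as F

def npo_loss(logp_theta, logp_ref, beta=0.1):
    # logp_*: summed log-probabilities of each forget sequence, shape (batch,)
    return (-2.0 / beta) * F.logsigmoid(-beta * (logp_theta - logp_ref)).mean()

def simnpo_loss(logp_theta, seq_lens, beta=2.5, gamma=0.0):
    # Reference-free variant: length-normalized log-prob with a margin gamma.
    return (-2.0 / beta) * F.logsigmoid(-beta * logp_theta / seq_lens - gamma).mean()

logp_theta = torch.tensor([-40.0, -55.0])   # model log-probs of forget sequences
logp_ref = torch.tensor([-42.0, -50.0])     # frozen reference-model log-probs
seq_lens = torch.tensor([20.0, 25.0])
print(npo_loss(logp_theta, logp_ref), simnpo_loss(logp_theta, seq_lens))
```

Dropping the reference model removes a second forward pass per batch, which is the "simplicity" the title alludes to.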

Defensive Unlearning with Adversarial Training for Robust Concept Erasure in Diffusion Models

Y Zhang, X Chen, J Jia, Y Zhang, C Fan, J Liu… - arXiv preprint arXiv …, 2024 - arxiv.org
Diffusion models (DMs) have achieved remarkable success in text-to-image generation, but
they also pose safety risks, such as the potential generation of harmful content and copyright …
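
Defensive unlearning pairs concept erasure with an inner adversary that searches for inputs which still elicit the erased concept. The toy min-max loop below captures that shape, with a linear "concept predictor" standing in for the diffusion pipeline; every name and constant here is a placeholder, not the paper's procedure.

```python
import torch
import torch.nn as nn

model = nn.Linear(16, 1)                      # stand-in for a concept predictor
opt = torch.optim.SGD(model.parameters(), lr=1e-2)
concept_emb = torch.randn(8, 16)              # embeddings of concept prompts

for _ in range(50):
    # Inner maximization: perturb the prompts to keep the concept activated.
    delta = torch.zeros_like(concept_emb, requires_grad=True)
    for _ in range(5):
        score = model(concept_emb + delta).sum()
        (g,) = torch.autograd.grad(score, delta)
        delta = (delta + 0.01 * g.sign()).clamp(-0.1, 0.1).detach().requires_grad_(True)
    # Outer minimization: erase the concept even under the adversarial prompts.
    opt.zero_grad()
    model(concept_emb + delta.detach()).square().mean().backward()
    opt.step()
```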

Does your LLM truly unlearn? An embarrassingly simple approach to recover unlearned knowledge

Z Zhang, F Wang, X Li, Z Wu, X Tang, H Liu… - arXiv preprint arXiv …, 2024 - arxiv.org
Large language models (LLMs) have shown remarkable proficiency in generating text,
benefiting from extensive training on vast textual corpora. However, LLMs may also acquire …
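
One embarrassingly simple post-hoc intervention is weight quantization, which we believe is the recovery vector studied here; either way, the sketch below shows how such a check could look: naively quantize an "unlearned" model round-to-nearest and compare forget-set loss before and after. The model and data are stand-ins.

```python
import torch
import torch.nn as nn

def quantize_(p, bits=4):
    # Naive round-to-nearest uniform quantization (illustrative, not a real PTQ method).
    scale = p.abs().max() / (2 ** (bits - 1) - 1)
    p.copy_((p / scale).round().clamp(-(2 ** (bits - 1)), 2 ** (bits - 1) - 1) * scale)

unlearned = nn.Linear(16, 4)                 # stand-in for an unlearned LLM
x_f, y_f = torch.randn(32, 16), torch.randint(0, 4, (32,))

with torch.no_grad():
    before = nn.functional.cross_entropy(unlearned(x_f), y_f).item()
    for p in unlearned.parameters():
        quantize_(p)
    after = nn.functional.cross_entropy(unlearned(x_f), y_f).item()
print(f"forget loss before/after quantization: {before:.3f} / {after:.3f}")
```

If the forget loss drops back toward its pre-unlearning level after such a cheap perturbation, the "unlearned" knowledge was suppressed rather than removed.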

MUNBa: Machine Unlearning via Nash Bargaining

J Wu, M Harandi - arXiv preprint arXiv:2411.15537, 2024 - arxiv.org
Machine Unlearning (MU) aims to selectively erase harmful behaviors from models while
retaining the overall utility of the model. As a multi-task learning problem, MU involves …
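
Casting unlearning as a bargaining game between a forgetting player and a retention player suggests aggregating their gradients with a Nash bargaining solution. The sketch below solves the Nash-MTL-style fixed point (G Gᵀ)α = 1/α for two objective gradients via a damped heuristic iteration; this illustrates the general idea, not necessarily MUNBa's exact algorithm.

```python
import torch

def nash_direction(grads, iters=200, damp=0.5):
    # grads: list of flattened per-objective gradients, e.g. [g_forget, g_retain]
    G = torch.stack(grads)                    # (k, d)
    K = G @ G.T                               # Gram matrix of the objectives
    alpha = torch.ones(len(grads))
    for _ in range(iters):
        # Damped fixed-point iteration for the bargaining condition (K @ alpha) = 1/alpha.
        alpha = (1 - damp) * alpha + damp / (K @ alpha).clamp(min=1e-8)
    return alpha @ G                          # joint update direction

g_forget, g_retain = torch.randn(100), torch.randn(100)
d = nash_direction([g_forget, g_retain])
print(g_forget @ d, g_retain @ d)             # at the fixed point both are 1/alpha_i > 0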

Are we making progress in unlearning? Findings from the first NeurIPS unlearning competition

E Triantafillou, P Kairouz, F Pedregosa, J Hayes… - arXiv preprint arXiv …, 2024 - arxiv.org
We present the findings of the first NeurIPS competition on unlearning, which sought to
stimulate the development of novel algorithms and initiate discussions on formal and robust …