Knowledge editing for large language models: A survey

S Wang, Y Zhu, H Liu, Z Zheng, C Chen, J Li - ACM Computing Surveys, 2024 - dl.acm.org
Large Language Models (LLMs) have recently transformed both the academic and industrial
landscapes due to their remarkable capacity to understand, analyze, and generate texts …

Ablating concepts in text-to-image diffusion models

N Kumari, B Zhang, SY Wang… - Proceedings of the …, 2023 - openaccess.thecvf.com
Large-scale text-to-image diffusion models can generate high-fidelity images with powerful
compositional ability. However, these models are typically trained on an enormous amount …

Explainable image classification: The journey so far and the road ahead

V Kamakshi, NC Krishnan - AI, 2023 - mdpi.com
Explainable Artificial Intelligence (XAI) has emerged as a crucial research area to address
the interpretability challenges posed by complex machine learning models. In this survey …

Machine unlearning: Solutions and challenges

J Xu, Z Wu, C Wang, X Jia - IEEE Transactions on Emerging …, 2024 - ieeexplore.ieee.org
Machine learning models may inadvertently memorize sensitive, unauthorized, or malicious
data, posing risks of privacy breaches, security vulnerabilities, and performance …

Can sensitive information be deleted from LLMs? Objectives for defending against extraction attacks

V Patil, P Hase, M Bansal - arXiv preprint arXiv:2309.17410, 2023 - arxiv.org
Pretrained language models sometimes possess knowledge that we do not wish them to have,
including memorized personal information and knowledge that could be used to harm …

Automatically Correcting Large Language Models: Surveying the Landscape of Diverse Automated Correction Strategies

L Pan, M Saxon, W Xu, D Nathani, X Wang… - Transactions of the …, 2024 - direct.mit.edu
While large language models (LLMs) have shown remarkable effectiveness in various NLP
tasks, they are still prone to issues such as hallucination, unfaithful reasoning, and toxicity. A …

The Memory-Perturbation Equation: Understanding Model's Sensitivity to Data

P Nickl, L Xu, D Tailor, T Möllenhoff… - Advances in Neural …, 2024 - proceedings.neurips.cc
Understanding a model's sensitivity to its training data is crucial but can also be challenging
and costly, especially during training. To simplify such issues, we present the Memory …

Cross-lingual editing in multilingual language models

H Beniwal, M Singh - arXiv preprint arXiv:2401.10521, 2024 - arxiv.org
The training of large language models (LLMs) necessitates substantial data and
computational resources, and updating outdated LLMs entails significant efforts and …

Adapt then unlearn: Exploiting parameter space semantics for unlearning in generative adversarial networks

P Tiwary, A Guha, S Panda - arXiv preprint arXiv:2309.14054, 2023 - arxiv.org
The increased attention to regulating the outputs of deep generative models, driven by
growing concerns about privacy and regulatory compliance, has highlighted the need for …

Retaining Beneficial Information from Detrimental Data for Neural Network Repair

LK Huang, P Zhao, J Huang… - Advances in Neural …, 2024 - proceedings.neurips.cc
The performance of deep learning models heavily relies on the quality of the training data.
Inadequacies in the training data, such as corrupted inputs or noisy labels, can lead to the …