Knowledge editing for large language models: A survey

S Wang, Y Zhu, H Liu, Z Zheng, C Chen, J Li - ACM Computing Surveys, 2024 - dl.acm.org
Large Language Models (LLMs) have recently transformed both the academic and industrial
landscapes due to their remarkable capacity to understand, analyze, and generate texts …

Ablating concepts in text-to-image diffusion models

N Kumari, B Zhang, SY Wang… - Proceedings of the …, 2023 - openaccess.thecvf.com
Large-scale text-to-image diffusion models can generate high-fidelity images with powerful
compositional ability. However, these models are typically trained on an enormous amount …

Explainable image classification: The journey so far and the road ahead

V Kamakshi, NC Krishnan - AI, 2023 - mdpi.com
Explainable Artificial Intelligence (XAI) has emerged as a crucial research area to address
the interpretability challenges posed by complex machine learning models. In this survey …

Machine unlearning: Solutions and challenges

J Xu, Z Wu, C Wang, X Jia - IEEE Transactions on Emerging …, 2024 - ieeexplore.ieee.org
Machine learning models may inadvertently memorize sensitive, unauthorized, or malicious
data, posing risks of privacy breaches, security vulnerabilities, and performance …

Can sensitive information be deleted from LLMs? Objectives for defending against extraction attacks

V Patil, P Hase, M Bansal - arXiv preprint arXiv:2309.17410, 2023 - arxiv.org
Pretrained language models sometimes possess knowledge that we do not wish them to have,
including memorized personal information and knowledge that could be used to harm …

Automatically Correcting Large Language Models: Surveying the Landscape of Diverse Automated Correction Strategies

L Pan, M Saxon, W Xu, D Nathani, X Wang… - Transactions of the …, 2024 - direct.mit.edu
While large language models (LLMs) have shown remarkable effectiveness in various NLP
tasks, they are still prone to issues such as hallucination, unfaithful reasoning, and toxicity. A …

The Memory-Perturbation Equation: Understanding Model's Sensitivity to Data

P Nickl, L Xu, D Tailor, T Möllenhoff… - Advances in Neural …, 2024 - proceedings.neurips.cc
Understanding a model's sensitivity to its training data is crucial but can also be challenging
and costly, especially during training. To simplify such issues, we present the Memory …

Cross-lingual editing in multilingual language models

H Beniwal, M Singh - arXiv preprint arXiv:2401.10521, 2024 - arxiv.org
The training of large language models (LLMs) necessitates substantial data and
computational resources, and updating outdated LLMs entails significant efforts and …

Adapt then unlearn: Exploiting parameter space semantics for unlearning in generative adversarial networks

P Tiwary, A Guha, S Panda - arXiv preprint arXiv:2309.14054, 2023 - arxiv.org
The increased attention to regulating the outputs of deep generative models, driven by
growing concerns about privacy and regulatory compliance, has highlighted the need for …

Retaining Beneficial Information from Detrimental Data for Neural Network Repair

LK Huang, P Zhao, J Huang… - Advances in Neural …, 2024 - proceedings.neurips.cc
The performance of deep learning models heavily relies on the quality of the training data.
Inadequacies in the training data, such as corrupted inputs or noisy labels, can lead to the …