S Jiang, Y Zhang, R Chen, Y Jin, Z Liu - arXiv preprint arXiv:2410.15334, 2024 - arxiv.org
Direct Preference Optimization (DPO) is effective for aligning large language models (LLMs), but when applied to multimodal large language models (MLLMs), it often favors text over image information …
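For context, the standard DPO objective the abstract refers to can be sketched as below. This is a minimal illustration of the usual formulation (Rafailov et al., 2023), not the paper's own code; the function and argument names are hypothetical.

```python
import torch
import torch.nn.functional as F

def dpo_loss(policy_chosen_logps: torch.Tensor,
             policy_rejected_logps: torch.Tensor,
             ref_chosen_logps: torch.Tensor,
             ref_rejected_logps: torch.Tensor,
             beta: float = 0.1) -> torch.Tensor:
    """Standard DPO objective (illustrative sketch).

    Each argument is a batch of summed log-probabilities of a response
    under the trainable policy or the frozen reference model; `beta`
    scales the implicit reward and the pull toward the reference.
    """
    # Implicit rewards: how much more the policy prefers each response
    # than the reference model does.
    chosen_rewards = beta * (policy_chosen_logps - ref_chosen_logps)
    rejected_rewards = beta * (policy_rejected_logps - ref_rejected_logps)

    # Maximize the log-sigmoid of the margin between the preferred
    # (chosen) and dispreferred (rejected) responses.
    return -F.logsigmoid(chosen_rewards - rejected_rewards).mean()
```

One way to read the modality-bias claim in the abstract: both the chosen and rejected log-probabilities condition on the same image, so nothing in the margin itself forces the preference signal to depend on visual evidence rather than text alone.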