Harmful fine-tuning attacks and defenses for large language models: A survey

T Huang, S Hu, F Ilhan, SF Tekin, L Liu - arXiv preprint arXiv:2409.18169, 2024 - arxiv.org
Recent research demonstrates that the nascent fine-tuning-as-a-service business model
exposes serious safety concerns: fine-tuning on even a small amount of harmful data uploaded by users …

Artwork protection against neural style transfer using locally adaptive adversarial color attack

Z Guo, J Dong, Y Qian, K Wang, W Li, Z Guo… - ECAI 2024, 2024 - ebooks.iospress.nl
Neural style transfer (NST) generates new images by combining the style of one image with
the content of another. However, unauthorized NST can exploit artwork, raising concerns …
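
For context on what the attacked pipeline optimizes, here is a minimal sketch of the classic NST objective (content loss plus Gram-matrix style loss, in the spirit of Gatys et al.). The function names, the alpha/beta weights, and the assumption that feature maps come from a pretrained CNN such as VGG are illustrative; this is generic NST, not the paper's locally adaptive adversarial color attack.

```python
# Minimal NST objective sketch, assuming gen_feats, content_feats, and
# style_feats are (C, H, W) feature maps from a pretrained CNN (e.g. VGG).
import torch

def gram(f: torch.Tensor) -> torch.Tensor:
    # (C, H, W) -> (C, C) Gram matrix of channel correlations,
    # normalized by the number of elements.
    c, h, w = f.shape
    f = f.view(c, h * w)
    return (f @ f.t()) / (c * h * w)

def nst_loss(gen_feats, content_feats, style_feats, alpha=1.0, beta=1e3):
    # Content term: keep the generated image's features close to the content image's.
    content_loss = torch.mean((gen_feats - content_feats) ** 2)
    # Style term: match channel-wise feature statistics via Gram matrices.
    style_loss = torch.mean((gram(gen_feats) - gram(style_feats)) ** 2)
    return alpha * content_loss + beta * style_loss
```

Because the style term depends only on feature statistics, perturbing those statistics (as a color attack does) can degrade stylization while leaving the image visually intact.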

Clean-label backdoor attack and defense: An examination of language model vulnerability

S Zhao, X Xu, L Xiao, J Wen, LA Tuan - Expert Systems with Applications, 2025 - Elsevier
Prompt-based learning, a paradigm that bridges the pre-training and fine-tuning
stages, has proven highly effective across various NLP tasks, particularly in …

On large language models safety, security, and privacy: A survey

R Zhang, HW Li, XY Qian, WB Jiang… - Journal of Electronic …, 2025 - Elsevier
The integration of artificial intelligence (AI) technology, particularly large language models
(LLMs), has become essential across various sectors due to their advanced language …

A grey-box attack against latent diffusion model-based image editing by posterior collapse

Z Guo, L Fang, J Lin, Y Qian, S Zhao, Z Wang… - arXiv preprint arXiv …, 2024 - arxiv.org
Recent advancements in generative AI, particularly Latent Diffusion Models (LDMs), have
revolutionized image synthesis and manipulation. However, these generative techniques …

Countering Backdoor Attacks in Image Recognition: A Survey and Evaluation of Mitigation Strategies

K Dunnett, R Arablouei, D Miller, V Dedeoglu… - arXiv preprint arXiv …, 2024 - arxiv.org
The widespread adoption of deep learning across various industries has introduced
substantial challenges, particularly in terms of model explainability and security. The …

Quantized Delta Weight Is Safety Keeper

Y Liu, Z Sun, X He, X Huang - arXiv preprint arXiv:2411.19530, 2024 - arxiv.org
Recent advancements in fine-tuning proprietary language models enable customized
applications across various domains but also introduce two major challenges: high resource …

Trading Devil Final: Backdoor attack via Stock market and Bayesian Optimization

O Mengara - arXiv preprint arXiv:2407.14573, 2024 - arxiv.org
Since the advent of generative artificial intelligence, every company and researcher has
been rushing to develop their own generative models, whether commercial or not. Given the …

Exploring Potential Prompt Injection Attacks in Federated Military LLMs and Their Mitigation

Y Lee, T Park, Y Lee, J Gong, J Kang - arXiv preprint arXiv:2501.18416, 2025 - arxiv.org
Federated Learning (FL) is increasingly being adopted in military collaborations to develop
Large Language Models (LLMs) while preserving data sovereignty. However, prompt …

Towards effective neural topic modeling

X Wu - 2024 - dr.ntu.edu.sg
Over the past few decades, the world has witnessed an unprecedented explosion of
information. Of these, a substantial portion consists of unlabeled textual data, such as …