A comprehensive overview of large language models

H Naveed, AU Khan, S Qiu, M Saqib, S Anwar… - arXiv preprint arXiv …, 2023 - arxiv.org
Large Language Models (LLMs) have recently demonstrated remarkable capabilities in
natural language processing tasks and beyond. This success of LLMs has led to a large …

Against The Achilles' Heel: A Survey on Red Teaming for Generative Models

L Lin, H Mu, Z Zhai, M Wang, Y Wang, R Wang… - Journal of Artificial …, 2025 - jair.org
Generative models are rapidly gaining popularity and being integrated into everyday
applications, raising concerns over their safe use as various vulnerabilities are exposed. In …

Adversarial attacks and defenses for large language models (LLMs): methods, frameworks & challenges

P Kumar - International Journal of Multimedia Information …, 2024 - Springer
Large language models (LLMs) have exhibited remarkable efficacy and proficiency in a
wide array of NLP endeavors. Nevertheless, concerns are growing rapidly regarding the …

Do LLMs dream of elephants (when told not to)? Latent concept association and associative memory in transformers

Y Jiang, G Rajendran, P Ravikumar… - arXiv preprint arXiv …, 2024 - arxiv.org
Large Language Models (LLMs) have the capacity to store and recall facts. Through
experimentation with open-source models, we observe that this ability to retrieve facts can …

TF-Attack: Transferable and fast adversarial attacks on large language models

Z Li, K Chen, L Liu, X Bai, M Yang, Y Xiang… - Knowledge-Based …, 2025 - Elsevier
With the great advancements in large language models (LLMs), adversarial attacks against
LLMs have recently attracted increasing attention. We found that pre-existing adversarial …

On large language models safety, security, and privacy: A survey

R Zhang, HW Li, XY Qian, WB Jiang… - Journal of Electronic …, 2025 - Elsevier
The integration of artificial intelligence (AI) technology, particularly large language models
(LLMs), has become essential across various sectors due to their advanced language …

Adversarial attacks on large language models

J Zou, S Zhang, M Qiu - International Conference on Knowledge Science …, 2024 - Springer
Large Language Models (LLMs) have rapidly advanced and garnered increasing
attention due to their remarkable capabilities across various applications. However …

FlipAttack: Jailbreak LLMs via flipping

Y Liu, X He, M Xiong, J Fu, S Deng, B Hooi - arXiv preprint arXiv …, 2024 - arxiv.org
This paper proposes a simple yet effective jailbreak attack named FlipAttack against
black-box LLMs. First, from the autoregressive nature, we reveal that LLMs tend to understand the …

Transferable Adversarial Attacks on SAM and Its Downstream Models

S Xia, W Yang, Y Yu, X Lin, H Ding, L Duan… - arXiv preprint arXiv …, 2024 - arxiv.org
The utilization of large foundational models has a dilemma: while fine-tuning downstream
tasks from them holds promise for making use of the well-generalized knowledge in practical …

Securing vision-language models with a robust encoder against jailbreak and adversarial attacks

MZ Hossain, A Imteaj - 2024 IEEE International Conference on …, 2024 - ieeexplore.ieee.org
Large Vision-Language Models (LVLMs), trained on multimodal big datasets, have
significantly advanced AI by excelling in vision-language tasks. However, these models …