M Diao, R Li, S Liu, G Liao, J Wang, X Cai… - arXiv preprint arXiv …, 2024 - arxiv.org
As large language models (LLMs) continue to advance in capability and influence, ensuring
their security and preventing harmful outputs have become crucial. A promising approach to …