A survey on evaluation of large language models

Y Chang, X Wang, J Wang, Y Wu, L Yang… - ACM Transactions on …, 2024 - dl.acm.org
Large language models (LLMs) are gaining increasing popularity in both academia and
industry, owing to their unprecedented performance in various applications. As LLMs …

A survey on large language model based autonomous agents

L Wang, C Ma, X Feng, Z Zhang, H Yang… - Frontiers of Computer …, 2024 - Springer
Autonomous agents have long been a research focus in academic and industry
communities. Previous research often focuses on training agents with limited knowledge …

A survey of GPT-3 family large language models including ChatGPT and GPT-4

KS Kalyan - Natural Language Processing Journal, 2024 - Elsevier
Large language models (LLMs) are a special class of pretrained language models (PLMs)
obtained by scaling model size, pretraining corpus and computation. LLMs, because of their …

TrustLLM: Trustworthiness in large language models

Y Huang, L Sun, H Wang, S Wu, Q Zhang, Y Li… - arXiv preprint arXiv …, 2024 - arxiv.org
Large language models (LLMs), exemplified by ChatGPT, have gained considerable
attention for their excellent natural language processing capabilities. Nonetheless, these …

PromptBench: Towards evaluating the robustness of large language models on adversarial prompts

K Zhu, J Wang, J Zhou, Z Wang, H Chen… - arXiv e …, 2023 - ui.adsabs.harvard.edu
The increasing reliance on Large Language Models (LLMs) across academia and industry
necessitates a comprehensive understanding of their robustness to prompts. In response to …

Large language models: a comprehensive survey of its applications, challenges, limitations, and future prospects

MU Hadi, Q Al Tashi, A Shah, R Qureshi… - Authorea …, 2024 - authorea.com
Within the vast expanse of computerized language processing, a revolutionary entity known
as Large Language Models (LLMs) has emerged, wielding immense power in its capacity to …

On evaluating adversarial robustness of large vision-language models

Y Zhao, T Pang, C Du, X Yang, C Li… - Advances in …, 2024 - proceedings.neurips.cc
Large vision-language models (VLMs) such as GPT-4 have achieved unprecedented
performance in response generation, especially with visual inputs, enabling more creative …

Position: TrustLLM: Trustworthiness in large language models

Y Huang, L Sun, H Wang, S Wu… - International …, 2024 - proceedings.mlr.press
Large language models (LLMs) have gained considerable attention for their excellent
natural language processing capabilities. Nonetheless, these LLMs present many …

Trustworthy LLMs: A survey and guideline for evaluating large language models' alignment

Y Liu, Y Yao, JF Ton, X Zhang, RGH Cheng… - arXiv preprint arXiv …, 2023 - arxiv.org
Ensuring alignment, which refers to making models behave in accordance with human
intentions [1, 2], has become a critical task before deploying large language models (LLMs) …

Towards an understanding of large language models in software engineering tasks

Z Zheng, K Ning, Q Zhong, J Chen, W Chen… - Empirical Software …, 2025 - Springer
Large Language Models (LLMs) have drawn widespread attention and research
due to their astounding performance in text generation and reasoning tasks. Derivative …