Large language models for software engineering: A systematic literature review

X Hou, Y Zhao, Y Liu, Z Yang, K Wang, L Li… - ACM Transactions on …, 2024 - dl.acm.org
Large Language Models (LLMs) have significantly impacted numerous domains, including
Software Engineering (SE). Many recent publications have explored LLMs applied to …

A survey on evaluation of large language models

Y Chang, X Wang, J Wang, Y Wu, L Yang… - ACM Transactions on …, 2024 - dl.acm.org
Large language models (LLMs) are gaining increasing popularity in both academia and
industry, owing to their unprecedented performance in various applications. As LLMs …

Is ChatGPT a good sentiment analyzer? A preliminary study

Z Wang, Q Xie, Y Feng, Z Ding, Z Yang… - arXiv preprint arXiv …, 2023 - arxiv.org
Recently, ChatGPT has drawn great attention from both the research community and the
public. We are particularly interested in whether it can serve as a universal sentiment …

A multitask, multilingual, multimodal evaluation of ChatGPT on reasoning, hallucination, and interactivity

Y Bang, S Cahyawijaya, N Lee, W Dai, D Su… - arXiv preprint arXiv …, 2023 - arxiv.org
This paper proposes a framework for quantitatively evaluating interactive LLMs such as
ChatGPT using publicly available data sets. We carry out an extensive technical evaluation …

ChatGPT is a knowledgeable but inexperienced solver: An investigation of commonsense problem in large language models

N Bian, X Han, L Sun, H Lin, Y Lu, B He, S Jiang… - arXiv preprint arXiv …, 2023 - arxiv.org
Large language models (LLMs) have made significant progress in NLP. However, their
ability to memorize, represent, and leverage commonsense knowledge has been a well …

GPTEval: A survey on assessments of ChatGPT and GPT-4

R Mao, G Chen, X Zhang, F Guerin… - arXiv preprint arXiv …, 2023 - arxiv.org
The emergence of ChatGPT has generated much speculation in the press about its potential
to disrupt social and economic systems. Its astonishing language ability has aroused strong …

Deep transfer learning for automatic speech recognition: Towards better generalization

H Kheddar, Y Himeur, S Al-Maadeed, A Amira… - Knowledge-Based …, 2023 - Elsevier
Automatic speech recognition (ASR) has recently become an important challenge when
using deep learning (DL). It requires large-scale training datasets and high computational …

Large language models for cyber security: A systematic literature review

HX Xu, SA Wang, N Li, K Wang, Y Zhao, K Chen… - arXiv preprint arXiv …, 2024 - arxiv.org
The rapid advancement of Large Language Models (LLMs) has opened up new
opportunities for leveraging artificial intelligence in various domains, including cybersecurity …

FLASK: Fine-grained language model evaluation based on alignment skill sets

S Ye, D Kim, S Kim, H Hwang, S Kim, Y Jo… - arXiv preprint arXiv …, 2023 - arxiv.org
Evaluation of Large Language Models (LLMs) is challenging because aligning to human
values requires the composition of multiple skills and the required set of skills varies …

A comprehensive evaluation of large language models on benchmark biomedical text processing tasks

I Jahan, MTR Laskar, C Peng, JX Huang - Computers in biology and …, 2024 - Elsevier
Recently, Large Language Models (LLMs) have demonstrated impressive
capability to solve a wide range of tasks. However, despite their success across various …