[HTML][HTML] Summary of chatgpt-related research and perspective towards the future of large language models

Y Liu, T Han, S Ma, J Zhang, Y Yang, J Tian, H He, A Li… - Meta-Radiology, 2023 - Elsevier
This paper presents a comprehensive survey of ChatGPT-related (GPT-3.5 and GPT-4)
research, state-of-the-art large language models (LLM) from the GPT series, and their …

A survey of natural language generation

C Dong, Y Li, H Gong, M Chen, J Li, Y Shen… - ACM Computing …, 2022 - dl.acm.org
This article offers a comprehensive review of the research on Natural Language Generation
(NLG) over the past two decades, especially in relation to data-to-text generation and text-to …

Holistic evaluation of language models

P Liang, R Bommasani, T Lee, D Tsipras… - arXiv preprint arXiv …, 2022 - arxiv.org
Language models (LMs) are becoming the foundation for almost all major language
technologies, but their capabilities, limitations, and risks are not well understood. We present …

G-eval: Nlg evaluation using gpt-4 with better human alignment

Y Liu, D Iter, Y Xu, S Wang, R Xu, C Zhu - arXiv preprint arXiv:2303.16634, 2023 - arxiv.org
The quality of texts generated by natural language generation (NLG) systems is hard to
measure automatically. Conventional reference-based metrics, such as BLEU and ROUGE …

Beyond the imitation game: Quantifying and extrapolating the capabilities of language models

A Srivastava, A Rastogi, A Rao, AAM Shoeb… - arXiv preprint arXiv …, 2022 - arxiv.org
Language models demonstrate both quantitative improvement and new qualitative
capabilities with increasing scale. Despite their potentially transformative impact, these new …

Benchmarking large language models for news summarization

T Zhang, F Ladhak, E Durmus, P Liang… - Transactions of the …, 2024 - direct.mit.edu
Large language models (LLMs) have shown promise for automatic summarization but the
reasons behind their successes are poorly understood. By conducting a human evaluation …

Gptscore: Evaluate as you desire

J Fu, SK Ng, Z Jiang, P Liu - arXiv preprint arXiv:2302.04166, 2023 - arxiv.org
Generative Artificial Intelligence (AI) has enabled the development of sophisticated models
that are capable of producing high-caliber text, images, and other outputs through the …

News summarization and evaluation in the era of gpt-3

T Goyal, JJ Li, G Durrett - arXiv preprint arXiv:2209.12356, 2022 - arxiv.org
The recent success of zero-and few-shot prompting with models like GPT-3 has led to a
paradigm shift in NLP research. In this paper, we study its impact on text summarization …

Is chatgpt a good nlg evaluator? a preliminary study

J Wang, Y Liang, F Meng, Z Sun, H Shi, Z Li… - arXiv preprint arXiv …, 2023 - arxiv.org
Recently, the emergence of ChatGPT has attracted wide attention from the computational
linguistics community. Many prior studies have shown that ChatGPT achieves remarkable …

AdaLoRA: Adaptive budget allocation for parameter-efficient fine-tuning

Q Zhang, M Chen, A Bukharin… - arXiv preprint arXiv …, 2023 - arxiv.org
Fine-tuning large pre-trained language models on downstream tasks has become an
important paradigm in NLP. However, common practice fine-tunes all of the parameters in a …