Large language models for software engineering: A systematic literature review

X Hou, Y Zhao, Y Liu, Z Yang, K Wang, L Li… - ACM Transactions on …, 2024 - dl.acm.org
Large Language Models (LLMs) have significantly impacted numerous domains, including
Software Engineering (SE). Many recent publications have explored LLMs applied to …

A comprehensive survey of ai-generated content (aigc): A history of generative ai from gan to chatgpt

Y Cao, S Li, Y Liu, Z Yan, Y Dai, PS Yu… - arXiv preprint arXiv …, 2023 - arxiv.org
Recently, ChatGPT, along with DALL-E-2 and Codex, has been gaining significant attention
from society. As a result, many individuals have become interested in related resources and …

Codegen: An open large language model for code with multi-turn program synthesis

E Nijkamp, B Pang, H Hayashi, L Tu, H Wang… - arXiv preprint arXiv …, 2022 - arxiv.org
Program synthesis strives to generate a computer program as a solution to a given problem
specification, expressed with input-output examples or natural language descriptions. The …

A systematic evaluation of large language models of code

FF Xu, U Alon, G Neubig, VJ Hellendoorn - Proceedings of the 6th ACM …, 2022 - dl.acm.org
Large language models (LMs) of code have recently shown tremendous promise in
completing code and synthesizing code from natural language descriptions. However, the …

The stack: 3 tb of permissively licensed source code

D Kocetkov, R Li, LB Allal, J Li, C Mou… - arXiv preprint arXiv …, 2022 - arxiv.org
Large Language Models (LLMs) play an ever-increasing role in the field of Artificial
Intelligence (AI)--not only for natural language processing but also for code understanding …

Codet5: Identifier-aware unified pre-trained encoder-decoder models for code understanding and generation

Y Wang, W Wang, S Joty, SCH Hoi - arXiv preprint arXiv:2109.00859, 2021 - arxiv.org
Pre-trained models for Natural Languages (NL) like BERT and GPT have been recently
shown to transfer well to Programming Languages (PL) and largely benefit a broad set of …

Program synthesis with large language models

J Austin, A Odena, M Nye, M Bosma… - arXiv preprint arXiv …, 2021 - arxiv.org
This paper explores the limits of the current generation of large language models for
program synthesis in general purpose programming languages. We evaluate a collection of …

SantaCoder: don't reach for the stars!

LB Allal, R Li, D Kocetkov, C Mou, C Akiki… - arXiv preprint arXiv …, 2023 - arxiv.org
The BigCode project is an open-scientific collaboration working on the responsible
development of large language models for code. This tech report describes the progress of …

The power of generative ai: A review of requirements, models, input–output formats, evaluation metrics, and challenges

A Bandi, PVSR Adapa, YEVPK Kuchi - Future Internet, 2023 - mdpi.com
Generative artificial intelligence (AI) has emerged as a powerful technology with numerous
applications in various domains. There is a need to identify the requirements and evaluation …

An extensive study on pre-trained models for program understanding and generation

Z Zeng, H Tan, H Zhang, J Li, Y Zhang… - Proceedings of the 31st …, 2022 - dl.acm.org
Automatic program understanding and generation techniques could significantly advance
the productivity of programmers and have been widely studied by academia and industry …