Natural language generation and understanding of big code for AI-assisted programming: A review

MF Wong, S Guo, CN Hang, SW Ho, CW Tan - Entropy, 2023 - mdpi.com
This paper provides a comprehensive review of the literature concerning the utilization of
Natural Language Processing (NLP) techniques, with a particular focus on transformer …

An empirical comparison of pre-trained models of source code

C Niu, C Li, V Ng, D Chen, J Ge… - 2023 IEEE/ACM 45th …, 2023 - ieeexplore.ieee.org
While a large number of pre-trained models of source code have been successfully
developed and applied to a variety of software engineering (SE) tasks in recent years, our …

Evaluating instruction-tuned large language models on code comprehension and generation

Z Yuan, J Liu, Q Zi, M Liu, X Peng, Y Lou - arXiv preprint arXiv:2308.01240, 2023 - arxiv.org
In this work, we evaluate 10 open-source instructed LLMs on four representative code
comprehension and generation tasks. We have the following main findings. First, for the zero …

Deep learning based code generation methods: A literature review

Z Yang, S Chen, C Gao, Z Li, G Li, R Lv - arXiv preprint arXiv:2303.01056, 2023 - arxiv.org
Code Generation aims at generating relevant code fragments according to given natural
language descriptions. In the process of software development, there exist a large number of …

A survey of neural code intelligence: Paradigms, advances and beyond

Q Sun, Z Chen, F Xu, K Cheng, C Ma, Z Yin… - arXiv preprint arXiv …, 2024 - arxiv.org
Neural Code Intelligence--leveraging deep learning to understand, generate, and optimize
code--holds immense potential for transformative impacts on the whole society. Bridging the …

PTM-APIRec: Leveraging Pre-trained Models of Source Code in API Recommendation

Z Li, C Li, Z Tang, W Huang, J Ge, B Luo, V Ng… - ACM Transactions on …, 2024 - dl.acm.org
Recommending APIs is a practical and essential feature of IDEs. Improving the accuracy of
API recommendations is an effective way to improve coding efficiency. With the success of …

Greening large language models of code

J Shi, Z Yang, HJ Kang, B Xu, J He, D Lo - Proceedings of the 46th …, 2024 - dl.acm.org
Large language models of code have shown remarkable effectiveness across various
software engineering tasks. Despite the availability of many cloud services built upon these …

CAT-probing: A metric-based approach to interpret how pre-trained models for programming language attend code structure

N Chen, Q Sun, R Zhu, X Li, X Lu, M Gao - arXiv preprint arXiv:2210.04633, 2022 - arxiv.org
Code pre-trained models (CodePTMs) have recently demonstrated significant success in
code intelligence. To interpret these models, some probing methods have been applied …

CrossCodeBench: Benchmarking cross-task generalization of source code models

C Niu, C Li, V Ng, B Luo - 2023 IEEE/ACM 45th International …, 2023 - ieeexplore.ieee.org
Despite the recent advances showing that a model pre-trained on large-scale source code
data is able to gain appreciable generalization capability, it still requires a sizeable amount …

The EarlyBIRD catches the bug: On exploiting early layers of encoder models for more efficient code classification

A Grishina, M Hort, L Moonen - Proceedings of the 31st ACM Joint …, 2023 - dl.acm.org
The use of modern Natural Language Processing (NLP) techniques has been shown to be
beneficial for software engineering tasks, such as vulnerability detection and type inference …