Large language models meet NL2Code: A survey

D Zan, B Chen, F Zhang, D Lu, B Wu, B Guan… - arXiv preprint arXiv …, 2022 - arxiv.org
The task of generating code from a natural language description, or NL2Code, is considered
a pressing and significant challenge in code intelligence. Thanks to the rapid development …

Generating high-precision feedback for programming syntax errors using large language models

T Phung, J Cambronero, S Gulwani, T Kohn… - arXiv preprint arXiv …, 2023 - arxiv.org
Large language models (LLMs), such as Codex, hold great promise in enhancing
programming education by automatically generating feedback for students. We investigate …

Clover: Closed-Loop Verifiable Code Generation

C Sun, Y Sheng, O Padon, C Barrett - International Symposium on AI …, 2024 - Springer
The use of large language models for code generation is a rapidly growing trend in software
development. However, without effective methods for ensuring the correctness of generated …

Phenomenal yet puzzling: Testing inductive reasoning capabilities of language models with hypothesis refinement

L Qiu, L Jiang, X Lu, M Sclar, V Pyatkin… - arXiv preprint arXiv …, 2023 - arxiv.org
The ability to derive underlying principles from a handful of observations and then
generalize to novel situations--known as inductive reasoning--is central to human …

CRUXEval: A benchmark for code reasoning, understanding and execution

A Gu, B Rozière, H Leather, A Solar-Lezama… - arXiv preprint arXiv …, 2024 - arxiv.org
We present CRUXEval (Code Reasoning, Understanding, and eXecution Evaluation), a
benchmark consisting of 800 Python functions (3-13 lines). Each function comes with an …

Calibration and correctness of language models for code

C Spiess, D Gros, KS Pai, M Pradel… - arXiv preprint arXiv …, 2024 - software-lab.org
Machine learning models are widely used, but can also often be wrong. Users would benefit
from a reliable indication of whether a given output from a given model should be trusted, so …

LLM-assisted code cleaning for training accurate code generators

N Jain, T Zhang, WL Chiang, JE Gonzalez… - arXiv preprint arXiv …, 2023 - arxiv.org
Natural language to code generation is an important application area of LLMs and has
received wide attention from the community. The majority of relevant studies have …

Can large language models transform natural language intent into formal method postconditions?

M Endres, S Fakhoury, S Chakraborty… - Proceedings of the ACM …, 2024 - dl.acm.org
Informal natural language that describes code functionality, such as code comments or
function documentation, may contain substantial information about a program's intent …

Quality and Trust in LLM-generated Code

C Spiess, D Gros, KS Pai, M Pradel… - arXiv e …, 2024 - ui.adsabs.harvard.edu
Machine learning models are widely used but can also often be wrong. Users
would benefit from a reliable indication of whether a given output from a given model should …

Formalizing natural language intent into program specifications via large language models

M Endres, S Fakhoury, S Chakraborty… - arXiv preprint arXiv …, 2023 - arxiv.org
Informal natural language that describes code functionality, such as code comments or
function documentation, may contain substantial information about a program's intent …