- Academic resource search

Large language models meet NL2Code: A survey

D Zan, B Chen, F Zhang, D Lu, B Wu, B Guan… - arXiv preprint arXiv …, 2022 - arxiv.org

The task of generating code from a natural language description, or NL2Code, is considered
a pressing and significant challenge in code intelligence. Thanks to the rapid development …

Cited by 151 | All 5 versions

Generating high-precision feedback for programming syntax errors using large language models

T Phung, J Cambronero, S Gulwani, T Kohn… - arXiv preprint arXiv …, 2023 - arxiv.org

Large language models (LLMs), such as Codex, hold great promise in enhancing
programming education by automatically generating feedback for students. We investigate …

Cited by 75 | All 7 versions

Clover: Closed-Loop Verifiable Code Generation

C Sun, Y Sheng, O Padon, C Barrett - International Symposium on AI …, 2024 - Springer

The use of large language models for code generation is a rapidly growing trend in software
development. However, without effective methods for ensuring the correctness of generated …

Cited by 30 | All 4 versions

Phenomenal yet puzzling: Testing inductive reasoning capabilities of language models with hypothesis refinement

L Qiu, L Jiang, X Lu, M Sclar, V Pyatkin… - arXiv preprint arXiv …, 2023 - arxiv.org

The ability to derive underlying principles from a handful of observations and then
generalize to novel situations--known as inductive reasoning--is central to human …

Cited by 41 | All 3 versions

CRUXEval: A benchmark for code reasoning, understanding and execution

A Gu, B Rozière, H Leather, A Solar-Lezama… - arXiv preprint arXiv …, 2024 - arxiv.org

We present CRUXEval (Code Reasoning, Understanding, and eXecution Evaluation), a
benchmark consisting of 800 Python functions (3-13 lines). Each function comes with an …

Cited by 52 | All 5 versions

Calibration and correctness of language models for code

C Spiess, D Gros, KS Pai, M Pradel… - arXiv preprint arXiv …, 2024 - software-lab.org

Machine learning models are widely used, but can also often be wrong. Users would benefit
from a reliable indication of whether a given output from a given model should be trusted, so …

Cited by 18

LLM-assisted code cleaning for training accurate code generators

N Jain, T Zhang, WL Chiang, JE Gonzalez… - arXiv preprint arXiv …, 2023 - arxiv.org

Natural language to code generation is an important application area of LLMs and has
received wide attention from the community. The majority of relevant studies have …

Cited by 20 | All 3 versions

Can large language models transform natural language intent into formal method postconditions?

M Endres, S Fakhoury, S Chakraborty… - Proceedings of the ACM …, 2024 - dl.acm.org

Informal natural language that describes code functionality, such as code comments or
function documentation, may contain substantial information about a program's intent …

Cited by 11 | All 2 versions

Quality and Trust in LLM-generated Code

C Spiess, D Gros, KS Pai, M Pradel… - arXiv e …, 2024 - ui.adsabs.harvard.edu

Machine learning models are widely used but can also often be wrong. Users
would benefit from a reliable indication of whether a given output from a given model should …

Cited by 17

Formalizing natural language intent into program specifications via large language models

M Endres, S Fakhoury, S Chakraborty… - arXiv preprint arXiv …, 2023 - arxiv.org

Informal natural language that describes code functionality, such as code comments or
function documentation, may contain substantial information about a program's intent …

Cited by 11 | All 2 versions
