- Academic resource search

Software testing with large language models: Survey, landscape, and vision

J Wang, Y Huang, C Chen, Z Liu… - IEEE Transactions on …, 2024 - ieeexplore.ieee.org

Pre-trained large language models (LLMs) have recently emerged as a breakthrough
technology in natural language processing and artificial intelligence, with the ability to …

Cited by 230 · Related articles · All 7 versions

[PDF] arxiv.org

A critical review of large language model on software engineering: An example from ChatGPT and automated program repair

Q Zhang, T Zhang, J Zhai, C Fang, B Yu, W Sun… - arXiv preprint arXiv …, 2023 - arxiv.org

Large Language Models (LLMs) have been gaining increasing attention and demonstrated
promising performance across a variety of Software Engineering (SE) tasks, such as …

Cited by 64 · Related articles · All 3 versions

[PDF] arxiv.org

A systematic literature review on large language models for automated program repair

Q Zhang, C Fang, Y Xie, YX Ma, W Sun, Y Yang… - arXiv preprint arXiv …, 2024 - arxiv.org

Automated Program Repair (APR) attempts to patch software bugs and reduce manual
debugging efforts. Very recently, with the advances in Large Language Models (LLMs), an …

Cited by 18 · Related articles · All 2 versions

[PDF] arxiv.org

Agent-as-a-Judge: Evaluate Agents with Agents

M Zhuge, C Zhao, D Ashley, W Wang… - arXiv preprint arXiv …, 2024 - arxiv.org

Contemporary evaluation techniques are inadequate for agentic systems. These
approaches either focus exclusively on final outcomes--ignoring the step-by-step nature of …

Cited by 8 · Related articles · All 2 versions

[PDF] arxiv.org

CigaR: Cost-efficient program repair with LLMs

D Hidvégi, K Etemadi, S Bobadilla… - arXiv preprint arXiv …, 2024 - arxiv.org

Large language models (LLM) have proven to be effective at automated program repair
(APR). However, using LLMs can be highly costly, with companies invoicing users by the …

Cited by 15 · Related articles · All 2 versions

[PDF] neurips.cc

Large language models of code fail at completing code with potential bugs

T Dinh, J Zhao, S Tan, R Negrinho… - Advances in …, 2024 - proceedings.neurips.cc

Large language models of code (Code-LLMs) have recently brought tremendous advances
to code completion, a fundamental feature of programming assistance and code …

Cited by 19 · Related articles · All 8 versions

[PDF] arxiv.org

Mdeval: Massively multilingual code debugging

S Liu, L Chai, J Yang, J Shi, H Zhu, L Wang… - arXiv preprint arXiv …, 2024 - arxiv.org

Code large language models (LLMs) have made significant progress in code debugging by
directly generating the correct code based on the buggy code snippet. Programming …

Cited by 4 · Related articles · All 2 versions

[PDF] arxiv.org

Super: Evaluating agents on setting up and executing tasks from research repositories

B Bogin, K Yang, S Gupta, K Richardson… - arXiv preprint arXiv …, 2024 - arxiv.org

Given that Large Language Models (LLMs) have made significant progress in writing code,
can they now be used to autonomously reproduce results from research repositories? Such …

Cited by 2 · Related articles · All 3 versions

[PDF] aaai.org

Better context makes better code language models: A case study on function call argument completion

H Pei, J Zhao, L Lausen, S Zha, G Karypis - Proceedings of the AAAI …, 2023 - ojs.aaai.org

Pretrained code language models have enabled great progress towards program synthesis.
However, common approaches only consider in-file local context and thus miss information …

Cited by 19 · Related articles · All 7 versions

[PDF] aclanthology.org

INTERVENOR: Prompting the Coding Ability of Large Language Models with the Interactive Chain of Repair

H Wang, Z Liu, S Wang, G Cui, N Ding… - Findings of the …, 2024 - aclanthology.org

This paper introduces INTERVENOR (INTERactiVE chaiN Of Repair), a system designed to
emulate the interactive code repair processes observed in humans, encompassing both …

Cited by 3 · Related articles · All 2 versions
