AgentCoder: Multi-agent-based code generation with iterative testing and optimisation

D Huang, Q Bu, JM Zhang, M Luck, H Cui - arXiv preprint arXiv …, 2023 - arxiv.org
The advancement of natural language processing (NLP) has been significantly boosted by
the development of transformer-based large language models (LLMs). These models have …

A unified debugging approach via LLM-based multi-agent synergy

C Lee, CS Xia, L Yang, J Huang, Z Zhu… - arXiv preprint arXiv …, 2024 - arxiv.org
Software debugging is a time-consuming endeavor involving a series of steps, such as fault
localization and patch generation, each requiring thorough analysis and a deep …

Do neutral prompts produce insecure code? FormAI-v2 dataset: Labelling vulnerabilities in code generated by large language models

N Tihanyi, T Bisztray, MA Ferrag, R Jain… - arXiv preprint arXiv …, 2024 - arxiv.org
This study provides a comparative analysis of state-of-the-art large language models
(LLMs), analyzing how likely they are to generate vulnerabilities when writing simple C programs …

Fixing code generation errors for large language models

H Wen, Y Zhu, C Liu, X Ren, W Du, M Yan - arXiv preprint arXiv …, 2024 - arxiv.org
Code generation leverages artificial intelligence technologies, particularly Large Language
Models (LLMs), to automatically produce source code, enhancing software development …

Beyond chain-of-thought: A survey of chain-of-X paradigms for LLMs

Y Xia, R Wang, X Liu, M Li, T Yu, X Chen… - arXiv preprint arXiv …, 2024 - arxiv.org
Chain-of-Thought (CoT) has been a widely adopted prompting method, eliciting impressive
reasoning abilities from Large Language Models (LLMs). Inspired by the sequential thought …

How secure is AI-generated code: a large-scale comparison of large language models

N Tihanyi, T Bisztray, MA Ferrag, R Jain… - Empirical Software …, 2025 - Springer
This study compares state-of-the-art Large Language Models (LLMs) on their tendency to
generate vulnerabilities when writing C programs using a neutral zero-shot prompt. Tihanyi …

Large Language Models as Test Case Generators: Performance Evaluation and Enhancement

K Li, Y Yuan - arXiv preprint arXiv:2404.13340, 2024 - arxiv.org
Code generation with Large Language Models (LLMs) has been extensively studied and
achieved remarkable progress. As a complementary aspect to code generation, test case …

Testing the Effect of Code Documentation on Large Language Model Code Understanding

W Macke, M Doyle - arXiv preprint arXiv:2404.03114, 2024 - arxiv.org
Large Language Models (LLMs) have demonstrated impressive abilities in recent years with
regard to code generation and understanding. However, little work has investigated how …

Untangling Knots: Leveraging LLM for Error Resolution in Computational Notebooks

K Grotov, S Titov, Y Zharov, T Bryksin - arXiv preprint arXiv:2405.01559, 2024 - arxiv.org
Computational notebooks have become indispensable tools for research-related development,
offering unprecedented interactivity and flexibility in the development process. However …