Is your code generated by ChatGPT really correct? Rigorous evaluation of large language models for code generation

J Liu, CS Xia, Y Wang, L Zhang - Advances in Neural …, 2024 - proceedings.neurips.cc
Program synthesis has long been studied, with recent approaches focused on directly using
the power of Large Language Models (LLMs) to generate code. Programming benchmarks …

Effective test generation using pre-trained large language models and mutation testing

AM Dakhel, A Nikanjam, V Majdinasab… - Information and …, 2024 - Elsevier
Context: One of the critical phases in the software development life cycle is software testing.
Testing helps with identifying potential bugs and reducing maintenance costs. The goal of …

ContrastRepair: Enhancing conversation-based automated program repair via contrastive test case pairs

J Kong, M Cheng, X Xie, S Liu, X Du, Q Guo - arXiv preprint arXiv …, 2024 - arxiv.org
Automated Program Repair (APR) aims to automatically generate patches for rectifying
software bugs. Recent strides in Large Language Models (LLM), such as ChatGPT, have …

Test Data Generation for Mutation Testing Based on Markov Chain Usage Model and Estimation of Distribution Algorithm

C Wei, X Yao, D Gong, H Liu - IEEE Transactions on Software …, 2024 - ieeexplore.ieee.org
Mutation testing, a mainstream fault-based software testing technique, can mimic a wide
variety of software faults by seeding them into the target program and resulting in the so …

ConstraintFlow: A DSL for Specification and Verification of Neural Network Analyses

A Singh, Y Sarita, C Mendis, G Singh - arXiv preprint arXiv:2403.18729, 2024 - arxiv.org
The uninterpretability of DNNs hinders their deployment to safety-critical applications.
Recent works have shown that Abstract-Interpretation-based formal certification techniques …

CriticalFuzz: A critical neuron coverage-guided fuzz testing framework for deep neural networks

T Bai, S Huang, Y Huang, X Wang, C Xia, Y Qu… - Information and …, 2024 - Elsevier
Context: Deep neural networks (DNN) have been widely deployed in safety-critical domains,
such as autonomous cars and healthcare, where error behaviors can lead to serious …

Marco: A Stochastic Asynchronous Concolic Explorer

J Hu, Y Duan, H Yin - Proceedings of the 46th IEEE/ACM International …, 2024 - dl.acm.org
Concolic execution is a powerful program analysis technique for code path exploration.
Despite recent advances that greatly improved the efficiency of concolic execution engines …

WeBridge: Synthesizing Stored Procedures for Large-Scale Real-World Web Applications

G Hu, Z Wang, C Tang, J Shen, Z Dong, S Yao… - Proceedings of the …, 2024 - dl.acm.org
Modern web applications use databases to store their data. When processing user requests,
these applications retrieve and store data in the database server, which incurs network …

Knowledge transfer based many-objective approach for finding bugs in multi-path loops

SD Semujju, F Liu, H Huang, Y Xiang, X Yan… - Complex & Intelligent …, 2024 - Springer
Generating test cases is essential for discovering software bugs. However, finding bugs in
multi-path loops is challenging, especially when bugs can only be exposed after a specific …

Testing concolic execution through consistency checks

E Coppa, A Izzillo - Journal of Systems and Software, 2024 - Elsevier
Symbolic execution is a well-known software testing technique that evaluates how a
program runs when considering a symbolic input, i.e., an input that can initially assume any …