Large language models for software engineering: A systematic literature review

X Hou, Y Zhao, Y Liu, Z Yang, K Wang, L Li… - ACM Transactions on …, 2023 - dl.acm.org
Large Language Models (LLMs) have significantly impacted numerous domains, including
Software Engineering (SE). Many recent publications have explored LLMs applied to …

A software engineering perspective on engineering machine learning systems: State of the art and challenges

G Giray - Journal of Systems and Software, 2021 - Elsevier
Context: Advancements in machine learning (ML) lead to a shift from the traditional view of
software development, where algorithms are hard-coded by humans, to ML systems …

A survey of flaky tests

O Parry, GM Kapfhammer, M Hilton… - ACM Transactions on …, 2021 - dl.acm.org
Tests that fail inconsistently, without changes to the code under test, are described as flaky.
Flaky tests do not give a clear indication of the presence of software bugs and thus limit the …

DeepFL: Integrating multiple fault diagnosis dimensions for deep fault localization

X Li, W Li, Y Zhang, L Zhang - Proceedings of the 28th ACM SIGSOFT …, 2019 - dl.acm.org
Learning-based fault localization has been intensively studied recently. Prior studies have
shown that traditional Learning-to-Rank techniques can help precisely diagnose fault …

Interactive code generation via test-driven user-intent formalization

SK Lahiri, S Fakhoury, A Naik, G Sakkas… - arXiv preprint arXiv …, 2022 - arxiv.org
Large language models (LLMs) have shown great potential in automating significant
aspects of coding by producing natural code from informal natural language (NL) intent …

Trade-offs in continuous integration: assurance, security, and flexibility

M Hilton, N Nelson, T Tunnell, D Marinov… - Proceedings of the 2017 …, 2017 - dl.acm.org
Continuous integration (CI) systems automate the compilation, building, and testing of
software. Despite CI being a widely used activity in software engineering, we do not know …

DeFlaker: Automatically detecting flaky tests

J Bell, O Legunsen, M Hilton, L Eloussi… - Proceedings of the 40th …, 2018 - dl.acm.org
Developers often run tests to check that their latest changes to a code repository did not
break any previously working functionality. Ideally, any new test failures would indicate …

Automated patch correctness assessment: How far are we?

S Wang, M Wen, B Lin, H Wu, Y Qin, D Zou… - Proceedings of the 35th …, 2020 - dl.acm.org
Test-based automated program repair (APR) has attracted huge attention from both industry
and academia. Despite the significant progress made in recent studies, the overfitting …

Bugs.jar: A large-scale, diverse dataset of real-world Java bugs

RK Saha, Y Lyu, W Lam, H Yoshida… - Proceedings of the 15th …, 2018 - dl.acm.org
We present Bugs.jar, a large-scale dataset for research in automated debugging, patching,
and testing of Java programs. Bugs.jar is comprised of 1,158 bugs and patches, drawn from …

Taming Google-scale continuous testing

A Memon, Z Gao, B Nguyen, S Dhanda… - 2017 IEEE/ACM 39th …, 2017 - ieeexplore.ieee.org
Growth in Google's code size and feature churn rate has seen increased reliance on
continuous integration (CI) and testing to maintain quality. Even with enormous resources …