A/B testing: a systematic literature review

F Quin, D Weyns, M Galster, CC Silva - Journal of Systems and Software, 2024 - Elsevier
A/B testing, also referred to as online controlled experimentation or continuous
experimentation, is a form of hypothesis testing where two variants of a piece of software are …

A software engineering perspective on engineering machine learning systems: State of the art and challenges

G Giray - Journal of Systems and Software, 2021 - Elsevier
Context: Advancements in machine learning (ML) lead to a shift from the traditional view of
software development, where algorithms are hard-coded by humans, to ML systems …

A survey of flaky tests

O Parry, GM Kapfhammer, M Hilton… - ACM Transactions on …, 2021 - dl.acm.org
Tests that fail inconsistently, without changes to the code under test, are described as flaky.
Flaky tests do not give a clear indication of the presence of software bugs and thus limit the …

DeepFL: Integrating multiple fault diagnosis dimensions for deep fault localization

X Li, W Li, Y Zhang, L Zhang - Proceedings of the 28th ACM SIGSOFT …, 2019 - dl.acm.org
Learning-based fault localization has been intensively studied recently. Prior studies have
shown that traditional Learning-to-Rank techniques can help precisely diagnose fault …

Bugs.jar: A large-scale, diverse dataset of real-world Java bugs

RK Saha, Y Lyu, W Lam, H Yoshida… - Proceedings of the 15th …, 2018 - dl.acm.org
We present Bugs.jar, a large-scale dataset for research in automated debugging, patching,
and testing of Java programs. Bugs.jar is comprised of 1,158 bugs and patches, drawn from …

Trade-offs in continuous integration: assurance, security, and flexibility

M Hilton, N Nelson, T Tunnell, D Marinov… - Proceedings of the 2017 …, 2017 - dl.acm.org
Continuous integration (CI) systems automate the compilation, building, and testing of
software. Despite CI being a widely used activity in software engineering, we do not know …

SapFix: Automated end-to-end repair at scale

A Marginean, J Bader, S Chandra… - 2019 IEEE/ACM 41st …, 2019 - ieeexplore.ieee.org
We report our experience with SapFix: the first deployment of automated end-to-end fault
fixing, from test case design through to deployed repairs in production code. We have used …

DeFlaker: Automatically detecting flaky tests

J Bell, O Legunsen, M Hilton, L Eloussi… - Proceedings of the 40th …, 2018 - dl.acm.org
Developers often run tests to check that their latest changes to a code repository did not
break any previously working functionality. Ideally, any new test failures would indicate …

Automated patch correctness assessment: How far are we?

S Wang, M Wen, B Lin, H Wu, Y Qin, D Zou… - Proceedings of the 35th …, 2020 - dl.acm.org
Test-based automated program repair (APR) has attracted huge attention from both industry
and academia. Despite the significant progress made in recent studies, the overfitting …

Understanding flaky tests: The developer's perspective

M Eck, F Palomba, M Castelluccio… - Proceedings of the 2019 …, 2019 - dl.acm.org
Flaky tests are software tests that exhibit a seemingly random outcome (pass or fail) despite
exercising unchanged code. In this work, we examine the perceptions of software …