R3: Record-Replay-Retroaction for Database-Backed Applications

Q Li, P Kraft, M Cafarella, Ç Demiralp… - Proceedings of the …, 2023 - dl.acm.org
Proceedings of the VLDB Endowment, 2023dl.acm.org
Developers would benefit greatly from time travel: being able to faithfully replay past
executions and retroactively execute modified code on past events. Currently, replay and
retroaction are impractical because they require expensively capturing fine-grained timing
information to reproduce concurrent accesses to shared state. In this paper, we propose
practical time travel for database-backed applications, an important class of distributed
applications that access shared state through transactions. We present R3, a novel Record …
Developers would benefit greatly from time travel: being able to faithfully replay past executions and retroactively execute modified code on past events. Currently, replay and retroaction are impractical because they require expensively capturing fine-grained timing information to reproduce concurrent accesses to shared state. In this paper, we propose practical time travel for database-backed applications, an important class of distributed applications that access shared state through transactions.
We present R3, a novel Record-Replay-Retroaction tool. R3 implements a lightweight interceptor to record concurrency information for applications at transaction-level granularity, enabling replay and retroaction with minimal overhead. We address key challenges in both replay and retroaction. First, we design a novel algorithm for faithfully reproducing application requests running with snapshot isolation, allowing R3 to support most production DBMSs. Second, we develop a retroactive execution mechanism that provides high fidelity with the original trace while supporting nearly arbitrary code modifications. We demonstrate how R3 simplifies debugging for real, hard-to-reproduce concurrency bugs from popular open-source web applications. We evaluate R3 using TPC-C and microservice workloads and show that R3 always-on recording has a small performance overhead (<25% for point queries but <0.1% for complex transactions like in TPC-C) during normal application execution and that R3 can retroactively execute bugfixed code over recorded traces within 0.11--0.78× of the original execution time.
ACM Digital Library
以上显示的是最相近的搜索结果。 查看全部搜索结果