An efficient sequential consistency implementation with dynamic race detection for GPUs

A Tabbakh, M Annavaram - Journal of Parallel and Distributed Computing, 2024 - Elsevier
… bandwidth and L2 cache traffic by nearly 5% compared to G-TSC … Cost-effective order-recording
and data race detection mechanism (CORD) [23] is a hardware-assisted race detection …