The efficacy of error mitigation techniques for DRAM retention failures: A comparative experimental study

S Khan, D Lee, Y Kim, AR Alameldeen… - ACM SIGMETRICS …, 2014 - dl.acm.org
As DRAM cells continue to shrink, they become more susceptible to retention failures.
DRAM cells that permanently exhibit short retention times are fairly easy to identify and …

The Reach Profiler (REAPER): Enabling the mitigation of DRAM retention failures via profiling at aggressive conditions

M Patel, JS Kim, O Mutlu - ACM SIGARCH Computer Architecture News, 2017 - dl.acm.org
Modern DRAM-based systems suffer from significant energy and latency penalties due to
conservative DRAM refresh standards. Volatile DRAM cells can retain information across a …

Cosmic rays don't strike twice: Understanding the nature of DRAM errors and the implications for system design

AA Hwang, IA Stefanovici, B Schroeder - ACM SIGPLAN Notices, 2012 - dl.acm.org
Main memory is one of the leading hardware causes for machine crashes in today's
datacenters. Designing, evaluating and modeling systems that are resilient against memory …

A study of DRAM failures in the field

V Sridharan, D Liberty - SC'12: Proceedings of the International …, 2012 - ieeexplore.ieee.org
Most modern computer systems use dynamic random access memory (DRAM) as a main
memory store. Recent publications have confirmed that DRAM errors are a common source …

ArchShield: Architectural framework for assisting DRAM scaling by tolerating high error rates

PJ Nair, DH Kim, MK Qureshi - ACM SIGARCH Computer Architecture …, 2013 - dl.acm.org
DRAM scaling has been the prime driver for increasing the capacity of main memory system
over the past three decades. Unfortunately, scaling DRAM to smaller technology nodes has …

PARBOR: An efficient system-level technique to detect data-dependent failures in DRAM

S Khan, D Lee, O Mutlu - 2016 46th Annual IEEE/IFIP …, 2016 - ieeexplore.ieee.org
System-level detection and mitigation of DRAM failures offer a variety of system
enhancements, such as better reliability, scalability, energy, and performance. Unfortunately …

An experimental study of data retention behavior in modern DRAM devices: Implications for retention time profiling mechanisms

J Liu, B Jaiyen, Y Kim, C Wilkerson… - ACM SIGARCH Computer …, 2013 - dl.acm.org
DRAM cells store data in the form of charge on a capacitor. This charge leaks off over time,
eventually causing data to be lost. To prevent this data loss from occurring, DRAM cells must …

DRAM errors in the wild: a large-scale field study

B Schroeder, E Pinheiro, WD Weber - ACM SIGMETRICS Performance …, 2009 - dl.acm.org
Errors in dynamic random access memory (DRAM) are a common form of hardware failure
in modern compute clusters. Failures are costly both in terms of hardware replacement costs …

Detecting and mitigating data-dependent DRAM failures by exploiting current memory content

S Khan, C Wilkerson, Z Wang, AR Alameldeen… - Proceedings of the 50th …, 2017 - dl.acm.org
DRAM cells in close proximity can fail depending on the data content in neighboring cells.
These failures are called data-dependent failures. Detecting and mitigating these failures …

DRAM errors in the wild: a large-scale field study

B Schroeder, E Pinheiro, WD Weber - Communications of the ACM, 2011 - dl.acm.org
Errors in dynamic random access memory (DRAM) are a common form of hardware failure
in modern compute clusters. Failures are costly both in terms of hardware replacement costs …