Addressing failures in exascale computing

M Snir, RW Wisniewski, JA Abraham… - … Journal of High …, 2014 - journals.sagepub.com
We present here a report produced by a workshop on 'Addressing failures in exascale
computing' held in Park City, Utah, 4–11 August 2012. The charter of this workshop was to …

SD codes: erasure codes designed for how storage systems really fail

JS Plank, M Blaum, JL Hafner - FAST, 2013 - usenix.org
SD Codes: Erasure Codes Designed for How Storage Systems Really Fail. James S. Plank,
University of Tennessee. USENIX FAST …

Sector-disk (SD) erasure codes for mixed failure modes in RAID systems

JS Plank, M Blaum - ACM Transactions on Storage (TOS), 2014 - dl.acm.org
Traditionally, when storage systems employ erasure codes, they are designed to tolerate the
failures of entire disks. However, the most common types of failures are latent sector failures …

SWEP-RF: Accuracy sliding window-based ensemble pruning method for latent sector error prediction in cloud storage computing

A Tahir, F Chen, AA Almazroi, NF Janbi - Journal of King Saud University …, 2023 - Elsevier
Latent sector errors (LSEs) in disk drives cause significant outages, data loss, and
unreliability in large-scale cloud storage systems. Predicting LSEs can help avoid these …

Higher reliability redundant disk arrays: Organization, operation, and coding

A Thomasian, M Blaum - ACM Transactions on Storage (TOS), 2009 - dl.acm.org
Parity is a popular form of data protection in redundant arrays of inexpensive/independent
disks (RAID). RAID5 dedicates one out of N disks to parity to mask single disk failures, that …

Towards securing data transfers against silent data corruption

B Charyyev, A Alhussen, H Sapkota… - 2019 19th IEEE/ACM …, 2019 - ieeexplore.ieee.org
Scientific applications generate large volumes of data that often needs to be moved between
geographically distributed sites for collaboration or backup which has led to a significant …

A Framework for Software Preservation

B Matthews, A Shaon, J Bicarregui… - Int. J. Digit. Curation, 2010 - epubs.stfc.ac.uk
Software preservation has not had detailed consideration as a research topic or in practical
application. In this report, we first discuss some of the motivations and problems of software …

Bit preservation: A solved problem?

DSH Rosenthal - iPRES 2008, 2008 - digipres.org
For years, discussions of digital preservation have routinely featured comments such as “bit
preservation is a solved problem; the real issues are...”. Indeed, current digital storage …

RIVA: Robust integrity verification algorithm for high-speed file transfers

B Charyyev, E Arslan - IEEE Transactions on Parallel and …, 2020 - ieeexplore.ieee.org
End-to-end integrity verification is designed to protect file transfers against silent data
corruption by comparing checksum of files at source and destination end points using …

Evaluating the impact of undetected disk errors in raid systems

EWD Rozier, W Belluomini… - 2009 IEEE/IFIP …, 2009 - ieeexplore.ieee.org
Despite the reliability of modern disks, recent studies have made it clear that a new class of
faults, Undetected Disk Errors (UDEs) also known as silent data corruption events, become a …