A survey of techniques for modeling and improving reliability of computing systems

S Mittal, JS Vetter - IEEE Transactions on Parallel and …, 2015 - ieeexplore.ieee.org
Recent trends of aggressive technology scaling have greatly exacerbated the occurrences
and impact of faults in computing systems. This has made 'reliability' a first-order design …

Demystifying the system vulnerability stack: Transient fault effects across the layers

G Papadimitriou, D Gizopoulos - 2021 ACM/IEEE 48th Annual …, 2021 - ieeexplore.ieee.org
In this paper, we revisit the system vulnerability stack for transient faults. We reveal severe
pitfalls in widely used vulnerability measurement approaches, which separate the hardware …

AVGI: Microarchitecture-driven, fast and accurate vulnerability assessment

G Papadimitriou, D Gizopoulos - 2023 IEEE International …, 2023 - ieeexplore.ieee.org
We propose AVGI, a new Statistical Fault Injection (SFI)-based methodology, which delivers
orders of magnitude faster assessment of the Architectural Vulnerability Factor (AVF) of a …

Soft error effects on ARM microprocessors: Early estimations versus chip measurements

PR Bodmann, G Papadimitriou… - IEEE Transactions …, 2021 - ieeexplore.ieee.org
Extensive research efforts are being carried out to evaluate and improve the reliability of
computing devices either through beam experiments or simulation-based fault injection …

Demystifying soft error assessment strategies on ARM CPUs: Microarchitectural fault injection vs. neutron beam experiments

A Chatzidimitriou, P Bodmann… - 2019 49th Annual …, 2019 - ieeexplore.ieee.org
Fault injection in early microarchitecture-level simulation CPU models and beam
experiments on the final physical CPU chip are two established methodologies to assess the …

Silent data corruptions: Microarchitectural perspectives

G Papadimitriou, D Gizopoulos - IEEE Transactions on …, 2023 - ieeexplore.ieee.org
Today more than ever before, academia, manufacturers, and hyperscalers acknowledge the
major challenge of silent data corruptions (SDCs) and aim on solutions to minimize its …

Differential fault injection on microarchitectural simulators

M Kaliorakis, S Tselonis… - 2015 IEEE …, 2015 - ieeexplore.ieee.org
Fault injection on micro architectural structures modeled in performance simulators is an
effective method for the assessment of microprocessors reliability in early design stages …

Impact of voltage scaling on soft errors susceptibility of multicore server CPUs

D Agiakatsikas, G Papadimitriou, V Karakostas… - Proceedings of the 56th …, 2023 - dl.acm.org
Microprocessor power consumption and dependability are both crucial challenges that
designers have to cope with due to shrinking feature sizes and increasing transistor counts …

Design and evaluation of buffered triple modular redundancy in interleaved-multi-threading processors

M Barbirotta, A Cheikh, A Mastrandrea… - IEEE …, 2022 - ieeexplore.ieee.org
Fault management in digital chips is a crucial aspect of functional safety. Significant work
has been done on gate and microarchitecture level triple modular redundancy, and on …

Evaluation of dynamic triple modular redundancy in an interleaved-multi-threading RISC-V core

M Barbirotta, A Cheikh, A Mastrandrea… - Journal of Low Power …, 2022 - mdpi.com
Functional safety is a key requirement in several application domains in which
microprocessors are an essential part. A number of redundancy techniques have been …