Demystifying the system vulnerability stack: Transient fault effects across the layers

G Papadimitriou, D Gizopoulos - 2021 ACM/IEEE 48th Annual …, 2021 - ieeexplore.ieee.org
In this paper, we revisit the system vulnerability stack for transient faults. We reveal severe
pitfalls in widely used vulnerability measurement approaches, which separate the hardware …

FIdelity: Efficient resilience analysis framework for deep learning accelerators

Y He, P Balaprakash, Y Li - 2020 53rd Annual IEEE/ACM …, 2020 - ieeexplore.ieee.org
We present a resilience analysis framework, called FIdelity, to accurately and quickly
analyze the behavior of hardware errors in deep learning accelerators. Our framework …

Understanding and mitigating hardware failures in deep learning training systems

Y He, M Hutton, S Chan, R De Gruijl… - Proceedings of the 50th …, 2023 - dl.acm.org
Deep neural network (DNN) training workloads are increasingly susceptible to hardware
failures in datacenters. For example, Google experienced "mysterious, difficult to identify …

Data masking techniques for NoSQL database security: A systematic review

A Cuzzocrea, H Shahriar - … conference on big data (Big Data), 2017 - ieeexplore.ieee.org
This paper first presents an in-depth study of potential security vulnerabilities in MongoDB
and Cassandra, two popular NoSQL databases. We provide examples of attacks. We then …

AVGI: Microarchitecture-driven, fast and accurate vulnerability assessment

G Papadimitriou, D Gizopoulos - 2023 IEEE International …, 2023 - ieeexplore.ieee.org
We propose AVGI, a new Statistical Fault Injection (SFI)-based methodology, which delivers
orders of magnitude faster assessment of the Architectural Vulnerability Factor (AVF) of a …

HarDNN: Feature map vulnerability evaluation in CNNs

A Mahmoud, SKS Hari, CW Fletcher, SV Adve… - arXiv preprint arXiv …, 2020 - arxiv.org
As Convolutional Neural Networks (CNNs) are increasingly being employed in safety-critical
applications, it is important that they behave reliably in the face of hardware errors. Transient …

Fault site pruning for practical reliability analysis of GPGPU applications

B Nie, L Yang, A Jog, E Smirni - 2018 51st Annual IEEE/ACM …, 2018 - ieeexplore.ieee.org
Graphics Processing Units (GPUs) have rapidly evolved to enable energy-efficient
data-parallel computing for a broad range of scientific areas. While GPUs achieve exascale …

Anatomy of on-chip memory hardware fault effects across the layers

G Papadimitriou, D Gizopoulos - IEEE Transactions on …, 2022 - ieeexplore.ieee.org
Reliability evaluation of a microprocessor design may reveal vulnerable silicon areas that
require protection against faults, but also hardware structures that are inherently more …

The Arm triple core lock-step (TCLS) processor

X Iturbe, B Venu, E Ozer, JL Poupat… - ACM Transactions on …, 2019 - dl.acm.org
The Arm Triple Core Lock-Step (TCLS) architecture is the natural evolution of Arm Cortex-R
Dual Core Lock-Step (DCLS) processors to increase dependability, predictability, and …

An empirical study of the impact of single and multiple bit-flip errors in programs

B Sangchoolie, K Pattabiraman… - IEEE Transactions on …, 2020 - ieeexplore.ieee.org
Recent studies have shown that technology and voltage scaling are expected to increase
the likelihood that particle-induced soft errors manifest as multiple-bit errors. This raises …