Cross-Layer Fault Analysis for Microprocessor Architectures (CLAM)

I Alshaer - 2023 - theses.hal.science
With the widespread use of embedded system devices, hardware designers and software
developers started paying more attention to security issues in order to protect these devices …

Understanding soft errors in uncore components

H Cho, CY Cher, T Shepherd, S Mitra - Proceedings of the 52nd Annual …, 2015 - dl.acm.org
The effects of soft errors in processor cores have been widely studied. However, little has
been published about soft errors in uncore components, such as memory subsystem and I/O …

Understanding reliability implication of hardware error in virtualization infrastructure

X Xu, HH Huang - 10th Workshop on Hot Topics in System Dependability …, 2014 - usenix.org
Hardware errors are no longer the exceptions in modern cloud data centers. Although
virtualization provides software failure isolation across different virtual machines (VM), the …

Exploring the impact of soft errors on the reliability of real-time embedded operating systems

S Azimi, C De Sio, A Portaluri, D Rizzieri, E Vacca… - Electronics, 2022 - mdpi.com
The continuous scaling of electronic components has led to the development of
high-performance microprocessors that are suitable even for safety-critical applications where …

Efficient error-detection and recovery mechanisms for reliability and resiliency of multicores

S Kundu, O Khan - 2016 29th International Conference on VLSI …, 2016 - ieeexplore.ieee.org
With increasing density of power, traditional frequency scaling of processors came to an
end. The power wall forced the industry to seek performance from parallel processing …

PLR: A software approach to transient fault tolerance for multicore architectures

A Shye, J Blomstedt, T Moseley… - … on Dependable and …, 2008 - ieeexplore.ieee.org
Transient faults are emerging as a critical concern in the reliability of general-purpose
microprocessors. As architectural trends point toward multicore designs, there is substantial …

Robustness evaluation of operating systems

A Johansson - 2008 - tuprints.ulb.tu-darmstadt.de
The premise behind this thesis is the observation that Operating Systems (OS), being the
foundation behind operations of computing systems, are complex entities and also subject to …

Using unreliable virtual hardware to inject errors in extreme-scale systems

S Levy, MGF Dosanjh, PG Bridges… - Proceedings of the 3rd …, 2013 - dl.acm.org
Fault tolerance is a key obstacle to next generation extreme-scale systems. As systems
scale, the Mean Time To Interrupt (MTTI) decreases proportionally. As a result, extreme …

Error Sensitivity of the Linux Kernel Executing on PowerPC G4 and Pentium 4 Processors

W Gu, Z Kalbarczyk, RK Iyer - DSN, 2004 - researchgate.net
The goals of this study are: (i) to compare Linux kernel (2.4.22) behavior under a broad
range of errors on two target processors—the Intel Pentium 4 (P4) running RedHat Linux 9.0 …

InstantOps: A Joint Approach to System Failure Prediction and Root Cause Identification in Microserivces Cloud-Native Applications

R Rouf, M Rasolroveicy, M Litoiu, S Nagar… - Proceedings of the 15th …, 2024 - dl.acm.org
As microservice and cloud computing operations increasingly adopt automation, the
importance of models for fostering resilient and efficient adaptive architectures becomes …