A survey on modeling and improving reliability of DNN algorithms and accelerators

S Mittal - Journal of Systems Architecture, 2020 - Elsevier
As DNNs become increasingly common in mission-critical applications, ensuring their
reliable operation has become crucial. Conventional resilience techniques fail to account for …

Soft errors in DNN accelerators: A comprehensive review

Y Ibrahim, H Wang, J Liu, J Wei, L Chen, P Rech… - Microelectronics …, 2020 - Elsevier
Deep learning tasks cover a broad range of domains and an even more extensive range of
applications, from entertainment to extremely safety-critical fields. Thus, Deep Neural …

Reliable data transmission model for mobile ad hoc network using signcryption technique

M Elhoseny, K Shankar - IEEE Transactions on Reliability, 2019 - ieeexplore.ieee.org
In recent years, the need for high security with reliability in the wireless network has
tremendously been increased. To provide high security in reliable networks, mobile ad hoc …

Testability and dependability of AI hardware: Survey, trends, challenges, and perspectives

F Su, C Liu, HG Stratigopoulos - IEEE Design & Test, 2023 - ieeexplore.ieee.org
Hardware realization of artificial intelligence (AI) requires new design styles and even
underlying technologies than those used in traditional digital processors or logic circuits …

A systematic literature review on hardware reliability assessment methods for deep neural networks

MH Ahmadilivani, M Taheri, J Raik… - ACM Computing …, 2024 - dl.acm.org
Artificial Intelligence (AI) and, in particular, Machine Learning (ML), have emerged to be
utilized in various applications due to their capability to learn how to solve complex …

A low-cost fault corrector for deep neural networks through range restriction

Z Chen, G Li, K Pattabiraman - 2021 51st Annual IEEE/IFIP …, 2021 - ieeexplore.ieee.org
The adoption of deep neural networks (DNNs) in safety-critical domains has engendered
serious reliability concerns. A prominent example is hardware transient faults that are …

GPU devices for safety-critical systems: A survey

J Perez-Cerrolaza, J Abella, L Kosmidis… - ACM Computing …, 2022 - dl.acm.org
Graphics Processing Unit (GPU) devices and their associated software programming
languages and frameworks can deliver the computing performance required to facilitate the …

Making convolutions resilient via algorithm-based error detection techniques

SKS Hari, MB Sullivan, T Tsai… - IEEE Transactions on …, 2021 - ieeexplore.ieee.org
Convolutional Neural Networks (CNNs) are being increasingly used in safety-critical and
high-performance computing systems. As such systems require high levels of resilience to …

RUL prediction using a fusion of attention-based convolutional variational autoencoder and ensemble learning classifier

I Remadna, LS Terrissa, Z Al Masry… - IEEE Transactions on …, 2022 - ieeexplore.ieee.org
Predicting the remaining useful life (RUL) is a critical step before the decision-making
process and developing maintenance strategies. As a result, it is frequently impacted by …

FlexGripPlus: An improved GPGPU model to support reliability analysis

JER Condia, B Du, MS Reorda, L Sterpone - Microelectronics Reliability, 2020 - Elsevier
General Purpose Graphics Processing Units (GPGPUs) have been extensively
used in the last decade as accelerators in high demanding applications, such as multimedia …