Workshop report on basic research needs for scientific machine learning: Core technologies for artificial intelligence

N Baker, F Alexander, T Bremer, A Hagberg… - 2019 - osti.gov
Scientific Machine Learning (SciML) and Artificial Intelligence (AI) will have broad use and
transformative effects across the Department of Energy. Accordingly, the January 2018 Basic …

Big data analytics: Machine learning and Bayesian learning perspectives—What is done? What is not?

S Suthaharan - Wiley Interdisciplinary Reviews: Data Mining …, 2019 - Wiley Online Library
Big data analytics provides an interdisciplinary framework that is essential to support the
current trend for solving real‐world problems collaboratively. The progression of big data …

Resiliency in numerical algorithm design for extreme scale simulations

E Agullo, M Altenbernd, H Anzt… - … Journal of High …, 2022 - journals.sagepub.com
This work is based on the seminar titled 'Resiliency in Numerical Algorithm Design for
Extreme Scale Simulations' held March 1–6, 2020, at Schloss Dagstuhl, that was attended …

Characterization of the impact of soft errors on iterative methods

BO Mutlu, G Kestor, J Manzano, O Unsal… - 2018 IEEE 25th …, 2018 - ieeexplore.ieee.org
Soft errors caused by transient bit flips have the potential to significantly impact an
application's behavior. This has motivated the design of an array of techniques to detect …

Towards end-to-end SDC detection for HPC applications equipped with lossy compression

S Li, S Di, K Zhao, X Liang, Z Chen… - … Conference on Cluster …, 2020 - ieeexplore.ieee.org
Data reduction techniques have been widely demanded and used by large-scale high
performance computing (HPC) applications because of vast volumes of data to be produced …

Predicting the silent data corruption vulnerability of instructions in programs

N Yang, Y Wang - 2019 IEEE 25th International Conference on …, 2019 - ieeexplore.ieee.org
With the decreasing size and voltage level of internal device components, soft errors are
increasing and constitute a major threat on electronic devices. Silent data corruption (SDC) …

Ground-truth prediction to accelerate soft-error impact analysis for iterative methods

BO Mutlu, G Kestor, A Cristal, O Unsal… - 2019 IEEE 26th …, 2019 - ieeexplore.ieee.org
Understanding the impact of soft errors on applications can be expensive. Often, it requires
an extensive error injection campaign involving numerous runs of the full application in the …

Efficient detection of silent data corruption in HPC applications with synchronization-free message verification

G Zhang, Y Liu, H Yang, D Qian - The Journal of Supercomputing, 2022 - Springer
Nowadays, high-performance computing (HPC) is stepping forward to exascale era.
However, silent data corruption (SDC) behaved as bit-flipping can cause disastrous …

FPDetect: Efficient Reasoning About Stencil Programs Using Selective Direct Evaluation

A Das, S Krishnamoorthy, I Briggs… - ACM Transactions on …, 2020 - dl.acm.org
We present FPDetect, a low-overhead approach for detecting logical errors and soft errors
affecting stencil computations without generating false positives. We develop an offline …

[BOOK] Soft Error Reliability Using Virtual Platforms: Early Evaluation of Multicore Systems

FR da Rosa, L Ost, R Reis - 2020 - books.google.com
This book describes the benefits and drawbacks inherent in the use of virtual platforms (VPs)
to perform fast and early soft error assessment of multicore systems. The authors show that …