Report of the HPC Correctness Summit, Jan 25--26, 2017, Washington, DC

G Gopalakrishnan, PD Hovland, C Iancu… - arXiv preprint arXiv …, 2017 - arxiv.org
Maintaining leadership in HPC requires the ability to support simulations at large scales and
fidelity. In this study, we detail one of the most significant productivity challenges in …

A domain-specific compiler for linear algebra operations

D Fabregat-Traver, P Bientinesi - … Conference, Kobe, Japan, July 17-20 …, 2013 - Springer
We present a prototypical linear algebra compiler that automatically exploits domain-specific
knowledge to generate high-performance algorithms. The input to the compiler is a target …

Application-tailored linear algebra algorithms: A search-based approach

D Fabregat-Traver, P Bientinesi - The International journal …, 2013 - journals.sagepub.com
In this paper, we tackle the problem of automatically generating algorithms for linear algebra
operations by taking advantage of problem-specific knowledge. In most situations, users …

Modeling of languages for tensor manipulation

NA Rink - arXiv preprint arXiv:1801.08771, 2018 - arxiv.org
Numerical applications and, more recently, machine learning applications rely on
high-dimensional data that is typically organized into multi-dimensional tensors. Many existing …

Solving tensor structured problems with computational tensor algebra

O Morozov, P Hunziker - arXiv preprint arXiv:1001.5460, 2010 - arxiv.org
Since its introduction by Gauss, Matrix Algebra has facilitated understanding of scientific
problems, hiding distracting details and finding more elegant and efficient ways of …

Automating Data-layout Decisions in Domain-specific Languages

D Deb - 2019 - search.proquest.com
A long-standing challenge in High-Performance Computing (HPC) is the simultaneous
achievement of programmer productivity and hardware computational efficiency. The …

The super instruction processor parallel design pattern for data and floating point intensive algorithms

V Lotrich, M Ponton, L Wang, A Yau… - Workshop on Patterns …, 2005 - academia.edu
A design pattern is considered in which a distributed memory, multi processor
computer is viewed as an early generation processor. In such processors, each instruction …

Analysis and Compilation of Parallel Programming Languages

A Susungi - 2018 - pastel.hal.science
Traditional compilation faces numerous challenges with program optimizations for parallel
architectures. A particular challenge is the design of proper intermediate languages and …

Report of the HPC Correctness Summit, January 25-26, 2017, Washington, DC

G Gopalakrishnan, PD Hovland, C Iancu… - 2017 - osti.gov
Technologies for verification and debugging have made significant strides in the context of
general systems software. An investment in such technologies to make them applicable for …

Parallel generalized tensor multiplication

C Kavaklıoğlu, AT Cemgil - 2012 20th Signal Processing and …, 2012 - ieeexplore.ieee.org
Tensor factorization is a frequently used modelling tool in problems involving large amounts
of n-way data. Probabilistic Latent Tensor Factorization framework provides a probabilistic …