Modeling, optimization and performance prediction of parallel algorithms

M Hudik, M Hodoň - 2014 IEEE Symposium on Computers and …, 2014 - ieeexplore.ieee.org
The high intensity of research and modeling in fields of mathematics, physics, biology and
chemistry requires new computing resources. For the big computational complexity of such …

Solving system of linear equations in a Network of Workstations

G Dimitriu, F Ionescu - 2006 Fifth International Symposium on …, 2006 - ieeexplore.ieee.org
In this article we propose an evaluation of the three common algorithms for solving linear
system of equations: Gauss elimination, Gauss-Jordan without pivoting and Jacobi with …

Developing a high performance software library with MPI and CUDA for matrix computations

B Oancea, T Andrei - arXiv preprint arXiv:1511.07174, 2015 - ceeol.com
Nowadays, the paradigm of parallel computing is changing. CUDA is now a popular
programming model for general purpose computations on GPUs and a great number of …

Prediction of parallel algorithms performance on bus-based networks using PVM

RJ Rodriguez, C Almeida… - Proceedings of the Sixth …, 1998 - ieeexplore.ieee.org
We adapt the classic sequential model used to predict the performance of communications
on parallel computers to a local area network using PVM. We have discovered that the linear …

On the processing time of a parallel linear system solver

A Stafylopatis, A Drigas - International Conference on Supercomputing, 1987 - Springer
The speed-up obtained by the use of multiprocessor systems is of major importance for
numerical applications involving the solution of large dense systems of linear equations. We …

A Parallel Algorithm for the Solution of the Deconvolution Problem on Heterogeneous Networks

P Alonso, AM Vidal… - 2006 IEEE International …, 2006 - ieeexplore.ieee.org
In this work we present a parallel algorithm for the solution of a least squares problem with
structured matrices. This problem arises in many applications mainly related to digital signal …

Fast matrix multiplication in dynamic SMP clusters with communication on the fly in systems on chip technology

M Tudruj, L Masko - International Symposium on Parallel …, 2006 - ieeexplore.ieee.org
This paper concerns numerical computations in a new shared memory system architecture
oriented towards systems on chip technology. Dynamically reconfigurable processor …

[BOOK][B] Parallel Processing and Applied Mathematics: 11th International Conference, PPAM 2015, Krakow, Poland, September 6-9, 2015. Revised Selected Papers …

R Wyrzykowski, E Deelman, J Dongarra, K Karczewski… - 2016 - books.google.com
This two-volume set LNCS 9573 and LNCS 9574 constitutes the refereed proceedings of the
11th International Conference of Parallel Processing and Applied Mathematics, PPAM 2015 …

Incorporating memory layout in the modeling of message passing programs

FJ Seinstra, D Koelma - Proceedings 10th Euromicro Workshop …, 2002 - ieeexplore.ieee.org
One of the most fundamental tasks of an automatic parallelization tool is to find an optimal
domain decomposition for a given application. For regular domain problems (such as simple …

Methods to utilize SIMT and SIMD instruction level parallelism in tridiagonal solvers

E László, MB Giles, J Appleyard… - 2014 14th international …, 2014 - ieeexplore.ieee.org
The most widely used parallel architectures in today's High Performance Computing
systems utilize multi-core CPUs, many-core GPUs or Intel's MIC (Many Integrated Core). The …