hwloc: A generic framework for managing hardware affinities in HPC applications

F Broquedis, J Clet-Ortega, S Moreaud… - 2010 18th Euromicro …, 2010 - ieeexplore.ieee.org
The increasing numbers of cores, shared caches and memory nodes within machines
introduces a complex hardware topology. High-performance computing applications now …

Cloud computing for parallel scientific HPC applications: Feasibility of running coupled atmosphere-ocean climate models on Amazon's EC2

C Evangelinos, C Hill - ratio, 2008 - academia.edu
In this article we describe the application of HPC standard benchmark tests to Amazon's
EC2 cloud computing system, in order to explore the utility of EC2 for modest HPC style …

A GPGPU transparent virtualization component for high performance computing clouds

G Giunta, R Montella, G Agrillo, G Coviello - Euro-Par 2010-Parallel …, 2010 - Springer
The GPU Virtualization Service (gVirtuS) presented in this work tries to fill the gap
between in-house hosted computing clusters, equipped with GPGPUs devices, and pay-for …

Understanding the impact of multi-core architecture in cluster computing: A case study with Intel dual-core system

L Chai, Q Gao, DK Panda - … on cluster computing and the grid …, 2007 - ieeexplore.ieee.org
Multi-core processors are growing as a new industry trend as single core processors rapidly
reach the physical limits of possible complexity and speed. In the new Top500 …

GASNet-EX: A high-performance, portable communication library for exascale

D Bonachea, PH Hargrove - … Workshop on Languages and Compilers for …, 2018 - Springer
Partitioned Global Address Space (PGAS) models, typified by languages such as
Unified Parallel C (UPC) and Co-Array Fortran, expose one-sided communication as a key …

Virtual machine aware communication libraries for high performance computing

W Huang, MJ Koop, Q Gao, DK Panda - Proceedings of the 2007 ACM …, 2007 - dl.acm.org
As the size and complexity of modern computing systems keep increasing to meet the
demanding requirements of High Performance Computing (HPC) applications …

Integrating asynchronous task parallelism with MPI

S Chatterjee, S Taşırlar, Z Budimlic… - 2013 IEEE 27th …, 2013 - ieeexplore.ieee.org
Effective combination of inter-node and intra-node parallelism is recognized to be a major
challenge for future extreme-scale systems. Many researchers have demonstrated the …

Designing high performance and scalable MPI intra-node communication support for clusters

L Chai, A Hartono, DK Panda - 2006 IEEE International …, 2006 - ieeexplore.ieee.org
As new processor and memory architectures advance, clusters start to be built from larger
SMP systems, which makes MPI intra-node communication a critical issue in high …

Cache-efficient, intranode, large-message MPI communication with MPICH2-Nemesis

D Buntinas, B Goglin, D Goodell… - 2009 International …, 2009 - ieeexplore.ieee.org
The emergence of multicore processors raises the need to efficiently transfer large amounts
of data between local processes. MPICH2 is a highly portable MPI implementation whose …

Implementation and evaluation of shared-memory communication and synchronization operations in MPICH2 using the Nemesis communication subsystem

D Buntinas, G Mercier, W Gropp - Parallel Computing, 2007 - Elsevier
This paper presents the implementation of MPICH2 over the Nemesis communication
subsystem and the evaluation of its shared-memory performance. We describe design …