Extending OpenSHMEM for GPU computing

H Wang, S Potluri, D Bureddy… - … on Parallel and …, 2013 - ieeexplore.ieee.org

Designing high-performance and scalable applications on GPU clusters requires tackling
several challenges. The key challenge is the separate host memory and device memory …

Cited by 124 · Related articles · All 4 versions

Multi-GPU Communication Schemes for Iterative Solvers: When CPUs Are Not in Charge

I Ismayilov, J Baydamirli, D Sağbili, M Wahib… - Proceedings of the 37th …, 2023 - dl.acm.org

This paper proposes a fully autonomous execution model for multi-GPU applications that
completely excludes the involvement of the CPU beyond the initial kernel launch. In a typical …

Cited by 12 · Related articles

XcalableACC: Extension of XcalableMP PGAS language using OpenACC for accelerator clusters

M Nakao, H Murai, T Shimosaka… - 2014 First Workshop …, 2014 - ieeexplore.ieee.org

The present paper introduces the XcalableACC (XACC) programming model, which is a
hybrid model of the XcalableMP (XMP) Partitioned Global Address Space (PGAS) language …

Cited by 63 · Related articles · All 11 versions

GPU-centric communication on NVIDIA GPU clusters with InfiniBand: A case study with OpenSHMEM

S Potluri, A Goswami, D Rossetti… - 2017 IEEE 24th …, 2017 - ieeexplore.ieee.org

GPUs have become an essential component for building compute clusters with high
compute density and high performance per watt. As such clusters scale to have 1000s of …

Cited by 38 · Related articles · All 4 versions

InfiniBand Verbs on GPU: a case study of controlling an InfiniBand network device from the GPU

L Oden, H Fröning - The International Journal of High …, 2017 - journals.sagepub.com

Due to their massive parallelism and high performance per Watt, GPUs have gained high
popularity in high-performance computing and are a strong candidate for future exascale …

Cited by 46 · Related articles · All 10 versions

Exploiting GPUDirect RDMA in designing high performance OpenSHMEM for NVIDIA GPU clusters

K Hamidouche, A Venkatesh, AA Awan… - 2015 IEEE …, 2015 - ieeexplore.ieee.org

GPUDirect RDMA (GDR) brings the high-performance communication capabilities of RDMA
networks like InfiniBand (IB) to GPUs (referred to as "Device"). It enables IB network …

Cited by 31 · Related articles · All 4 versions

GPU-aware non-contiguous data movement in Open MPI

W Wu, G Bosilca, R Vandevaart, S Jeaugey… - Proceedings of the 25th …, 2016 - dl.acm.org

Due to better parallel density and power efficiency, GPUs have become more popular for
use in scientific applications. Many of these applications are based on the ubiquitous …

Cited by 25 · Related articles · All 5 versions

MPI-ACC: accelerator-aware MPI for scientific applications

AM Aji, LS Panwar, F Ji, K Murthy… - IEEE transactions on …, 2015 - ieeexplore.ieee.org

Data movement in high-performance computing systems accelerated by graphics
processing units (GPUs) remains a challenging problem. Data communication in popular …

Cited by 28 · Related articles · All 14 versions

A novel approach for big data processing using message passing interface based on memory mapping

SA Dheyab, MN Abdullah, BF Abed - Journal of Big Data, 2019 - Springer

The analysis and processing of big data are one of the most important challenges that
researchers are working on to find the best approaches to handle it with high performance …

Cited by 14 · Related articles · All 9 versions

Energy-efficient collective reduce and allreduce operations on distributed GPUs

L Oden, B Klenk, H Fröning - 2014 14th IEEE/ACM …, 2014 - ieeexplore.ieee.org

GPUs gain high popularity in High Performance Computing, due to their massive parallelism
and high performance per Watt. Despite their popularity, data transfer between multiple …

Cited by 30 · Related articles · All 7 versions
