Analyzing Put/Get APIs for Thread-Collaborative Processors

B Klenk, L Oden, H Froening - 2014 43rd International …, 2014 - ieeexplore.ieee.org
2014 43rd International Conference on Parallel Processing Workshops, 2014 - ieeexplore.ieee.org
In High-Performance Computing (HPC), GPU-based accelerators are pervasive for two reasons. First, GPUs provide much higher raw computational power than traditional CPUs. Second, power consumption increases sub-linearly with performance, making GPUs much more energy-efficient than CPUs in terms of GFLOPS/Watt. Although these advantages are limited to a selected set of workloads, most HPC applications can benefit substantially from GPUs. The top 11 entries of the current Green500 list (November 2013) are all GPU-accelerated systems, which supports these statements. For system architects, however, the use of GPUs is challenging, as their architecture is based on thread-collaborative execution and differs significantly from CPUs, which are mainly optimized for single-thread performance. The interfaces to other devices in a system, in particular the network device, are still optimized solely for CPUs. This makes GPU-controlled I/O a challenge, although it is desirable for savings in energy and time. This is especially true for network devices, which are a key component in HPC systems. In previous work we have shown that GPUs can directly source and sink network traffic for InfiniBand devices without any involvement of the host CPUs, but this approach does not provide any performance benefits. Here we explore another API for Put/Get operations that can overcome some limitations. In particular, we provide detailed reasoning about the issues that prevent performance advantages when directly controlling I/O from the GPU domain.