K Wu, J Ren, D Li - SC18: International Conference for High …, 2018 - ieeexplore.ieee.org
Non-volatile memory (NVM) provides a scalable solution to replace DRAM as main memory. Because of the relatively high latency and low bandwidth of NVM (compared with DRAM), NVM …
We expect that the size and the complexity of future supercomputers will increase on their path to exascale systems and beyond. Therefore, system software has to adapt to the …
W Gropp, M Snir - Computing in Science & Engineering, 2013 - ieeexplore.ieee.org
Exascale systems will present programmers with many challenges. The authors review the parallel programming models that are appropriate for such systems and the challenges that …
The now commonplace multi-core chips have introduced, by design, a deep hierarchy of memory and cache banks within parallel computers as a tradeoff between the user …
L Stanisic, S Thibault, A Legrand… - Concurrency and …, 2015 - Wiley Online Library
Multi‐core architectures comprising several graphics processing units (GPUs) have become mainstream in the field of high‐performance computing. However, obtaining the maximum …
EHM da Cruz, MAZ Alves, A Carissimi… - … on Parallel and …, 2011 - ieeexplore.ieee.org
In parallel programs, the tasks of a given application must cooperate in order to accomplish the required computation. However, the communication time between the tasks may be …
Performance degradation due to nonuniform data access latencies has worsened on NUMA systems and can now be felt on‐chip in manycore processors. Distributing data across …
S Pophale, O Hernandez - OpenMP: Memory, Devices, and Tasks: 12th …, 2016 - Springer
As we move toward pre-Exascale systems, two of the DOE leadership class systems will consist of very powerful OpenPOWER compute nodes which will be more complex to …
We present an approach to improving data locality across different phases of fork/join programs scheduled using work stealing. The approach consists of: (1) user-specified and …