Improving scalability with GPU-aware asynchronous tasks

J Choi, DF Richards, LV Kale - 2022 IEEE International Parallel …, 2022 - ieeexplore.ieee.org
Asynchronous tasks, when created with over-decomposition, enable automatic computation-
communication overlap which can substantially improve performance and scalability. This …

Design and implementation of the PALM-3000 real-time control system

TN Truong, AH Bouchez, RS Burruss… - … Optics Systems III, 2012 - spiedigitallibrary.org
This paper reflects, from a computational perspective, on the experience gathered in
designing and implementing realtime control of the PALM-3000 adaptive optics system …

Distributing and load balancing sparse fluid simulations

C Shah, D Hyde, H Qu, P Levis - Computer Graphics Forum, 2018 - Wiley Online Library
This paper describes a general algorithm and a system for load balancing sparse fluid
simulations. Automatically distributing sparse fluid simulations efficiently is challenging …

Scalable heterogeneous computing with asynchronous message-driven execution

J Choi - 2022 - ideals.illinois.edu
Computer systems today are becoming increasingly heterogeneous, in response to
increasingly demanding performance requirements of both traditional and emerging …

Strategies to hide communication for a classical molecular dynamics proxy application

I Ngatang, M Sosonkina - Proceedings of the Symposium on High …, 2015 - dl.acm.org
Co-designing applications and computer architectures has become of major importance due
to the growing complexity of both applications and architectures and the need to better …

2014 Runtime Systems Summit. Runtime Systems Report

V Sarkar, Z Budimlic, M Kulkarni - 2016 - osti.gov
This report summarizes runtime system challenges for exascale computing, that follow from
the fundamental challenges for exascale systems that have been well studied in past …

Improving Hardware Performance via Non-Blocking Collective Communications and Computation Reordering for All-Pairs Shortest Path Computation on the Cell …

EA Colmenares - 2014 - ttu-ir.tdl.org
Most of the scientific problems being faced by researchers do not have embarrassingly
parallel solutions. This means that dynamic data sharing among multiple participant cores is …

Analysis and visualization of communication/computation patterns of high-performance applications

E Llamosí Santamans - 2014 - upcommons.upc.edu
High Performance Computing (HPC) is a branch of computer science dedicated to the
design and development of supercomputers and the software that runs on them. These …

Separating implementation concerns in stencil computations for semiregular grids

A Stone - 2013 - search.proquest.com
In atmospheric and ocean simulation programs, stencil computations occur on semiregular
grids where subdomains of the grid are regular (i.e., stored in an array), but boundaries …

Dissertation Proposal: Abstractions for, and Generation of, Semi-Regular Grid Computations

A Stone - 2012 - astonewebsite.s3.amazonaws.com
In various applications including atmospheric and ocean simulation programs, stencil
computations occur on grids where sub-domains of the grid are regular (e.g., can be stored in …