area to more versatile setups. However, the lack of a regular topology limits the scalability of
distributed parallel methods, especially for routines that perform a search in physical space.
One of the most prominent bottlenecks is the search for halo elements in physical space,
which is performed to avoid communication at runtime. In this work, we present a new
communication-free halo element search algorithm utilizing the MPI-3 shared memory …