G Sun, S Kang, SW Jun - ACM Transactions on Reconfigurable …, 2022 - dl.acm.org
We present BurstZ+, an accelerator platform that eliminates the communication bottleneck between PCIe-attached scientific computing accelerators and their host servers, via …
Graphics Processing Units (GPUs) are used as general purpose parallel accelerators in a wide range of applications. They are found in most computing systems, and mobile devices …
The trend towards specialization of software and hardware, fuelled by the end of Moore's law and the still accelerating interest in domain-specific computing, such as machine learning …
T Koehler, M Steuwer - 2021 IEEE/ACM International …, 2021 - ieeexplore.ieee.org
Halide and many similar projects have demonstrated the great potential of domain-specific optimizing compilers. They enable programs to be expressed at a convenient high-level …
A Rasch - arXiv preprint arXiv:2405.05118, 2024 - arxiv.org
We formally introduce a systematic (de/re)-composition approach, based on the algebraic formalism of "Multi-Dimensional Homomorphisms (MDHs)". Our approach is designed as …
G Sun, S Kang, SW Jun - Proceedings of the 34th ACM International …, 2020 - dl.acm.org
We present BurstZ, a bandwidth-efficient accelerator platform for scientific computing. While accelerators such as GPUs and FPGAs provide enormous computing capabilities, their …
Embedded software is found everywhere from our highly visible mobile devices to the confines of our car in the form of smart sensors. Embedded software companies are under …
U Beaugnon, A Pouille, M Pouzet, J Pienaar… - Proceedings of the 26th …, 2017 - dl.acm.org
Many computationally-intensive algorithms benefit from the wide parallelism offered by Graphical Processing Units (GPUs). However, the search for a close-to-optimal …
Stencil computations are a widely used type of algorithm, found in applications from physical simulations to machine learning. Stencils are embarrassingly parallel, therefore fit on …