RipTide: A programmable, energy-minimal dataflow compiler and architecture

G Gobieski, S Ghosh, M Heule, T Mowry… - 2022 55th IEEE/ACM …, 2022 - ieeexplore.ieee.org
Emerging sensing applications create an unprecedented need for energy efficiency in
programmable processors. To achieve useful multi-year deployments on a small battery or …

LISA: Graph neural network based portable mapping on spatial accelerators

Z Li, D Wu, D Wijerathne, T Mitra - 2022 IEEE International …, 2022 - ieeexplore.ieee.org
Spatial accelerators, such as Coarse-Grained Reconfigurable Arrays (CGRA), provide a
promising pathway to scale the performance and power efficiency of computing systems …

AHA: An agile approach to the design of coarse-grained reconfigurable accelerators and compilers

K Koul, J Melchert, K Sreedhar, L Truong… - ACM Transactions on …, 2023 - dl.acm.org
With the slowing of Moore's law, computer architects have turned to domain-specific
hardware specialization to continue improving the performance and efficiency of computing …

AURORA: Automated refinement of coarse-grained reconfigurable accelerators

C Tan, C Xie, A Li, KJ Barker… - 2021 Design, Automation …, 2021 - ieeexplore.ieee.org
Coarse-grained reconfigurable arrays (CGRAs), loosely defined as arrays of functional units
interconnected through a network-on-chip (NoC), provide higher flexibility than domain …

Energy efficient design of coarse-grained reconfigurable architectures: Insights, trends and challenges

E Aliagha, D Göhringer - 2022 International Conference on …, 2022 - ieeexplore.ieee.org
Coarse-Grained Reconfigurable Architectures (CGRAs) are promising solutions to achieve
more performance with the end of Moore's law. Thanks to word-level programmability, they …

TaskStream: Accelerating task-parallel workloads by recovering program structure

V Dadu, T Nowatzki - Proceedings of the 27th ACM International …, 2022 - dl.acm.org
Reconfigurable accelerators, like CGRAs and dataflow architectures, have come to
prominence for addressing data-processing problems. However, they are largely limited to …

Täkō: A polymorphic cache hierarchy for general-purpose optimization of data movement

BC Schwedock, P Yoovidhya, J Seibert… - Proceedings of the 49th …, 2022 - dl.acm.org
Current systems hide data movement from software behind the load-store interface.
Software's inability to observe and respond to data movement is the root cause of many …

SCALO: an accelerator-rich distributed system for scalable brain-computer interfacing

K Sriram, RP Pothukuchi, M Gerasimiuk… - Proceedings of the 50th …, 2023 - dl.acm.org
SCALO is the first distributed brain-computer interface (BCI) consisting of multiple
wireless-networked implants placed on different brain regions. SCALO unlocks new treatment options …

TRAM: An open-source template-based reconfigurable architecture modeling framework

Y Qiu, Y Cao, Y Dai, W Yin… - 2022 32nd International …, 2022 - ieeexplore.ieee.org
Coarse-grained reconfigurable architecture (CGRA) is a promising accelerator design
choice due to its high performance and power efficiency in the computation or data-intensive …

Unified buffer: Compiling image processing and machine learning applications to push-memory accelerators

Q Liu, J Setter, D Huff, M Strange, K Feng… - ACM Transactions on …, 2023 - dl.acm.org
Image processing and machine learning applications benefit tremendously from hardware
acceleration. Existing compilers target either FPGAs, which sacrifice power and performance …