Modular and lean architecture with elasticity for sparse matrix vector multiplication on FPGAs

AK Jain, C Ravishankar, H Omidian… - 2023 IEEE 31st …, 2023 - ieeexplore.ieee.org
The use of domain-specific accelerators is becoming prominent for a variety of emerging
domains such as graph analytics and HPC, where most of the computations revolve around …

Graphitron: A domain specific language for FPGA-based graph processing accelerator generation

X Zhang, Z Feng, S Liang, X Chen, C Liu, H Li… - arXiv preprint arXiv …, 2024 - arxiv.org
FPGA-based graph processing accelerators, enabling extensive customization, have
demonstrated significant energy efficiency over general computing engines like CPUs and …

F-TADOC: FPGA-based text analytics directly on compression with HLS

Y Zhou, F Zhang, T Lin, Y Huang… - 2024 IEEE 40th …, 2024 - ieeexplore.ieee.org
With the development of IoT and edge computing, data analytics on edge has become
popular, and text analytics directly on compression (TADOC) has been proven to be a …

Graphset: High performance graph mining through equivalent set transformations

T Shi, J Zhai, H Wang, Q Chen, M Zhai, Z Hao… - Proceedings of the …, 2023 - dl.acm.org
Graph mining is of critical use in a number of fields such as social networks, knowledge
graphs, and fraud detection. As an NP-complete problem, accelerating computation …

ScalaBFS2: A High-performance BFS Accelerator on an HBM-enhanced FPGA Chip

K Li, S Xu, Z Shao, R Zheng, X Liao, H Jin - ACM Transactions on …, 2024 - dl.acm.org
The introduction of High Bandwidth Memory (HBM) to the FPGA chip makes it possible for
an FPGA-based accelerator to leverage the huge memory bandwidth of HBM to improve its …

Graph-OPU: A highly integrated FPGA-based overlay processor for graph neural networks

R Chen, H Zhang, S Li, E Tang, J Yu… - 2023 33rd International …, 2023 - ieeexplore.ieee.org
Field-programmable gate array (FPGA) is an ideal candidate for accelerating graph neural
networks (GNNs). However, FPGA reconfiguration is a time-consuming process when …

High-performance and resource-efficient dynamic memory management in high-level synthesis

Q Wang, L Zheng, Z An, H Huang, H Zhu… - Proceedings of the 61st …, 2024 - dl.acm.org
With the merits of high productivity and ease of use, high-level synthesis (HLS) tools bring
hope to fast FPGA-based architecture development. However, their usability and popularity …

LL-GNN: Low Latency Graph Neural Networks on FPGAs for High Energy Physics

Z Que, H Fan, M Loo, H Li, M Blott, M Pierini… - ACM Transactions on …, 2024 - dl.acm.org
This work presents a novel reconfigurable architecture for Low Latency Graph Neural
Network (LL-GNN) designs for particle detectors, delivering unprecedented low latency …

Sagraph: A similarity-aware hardware accelerator for temporal graph processing

J Zhao, Y Zhang, J Cheng, Y Wu, C Ye… - 2023 60th ACM/IEEE …, 2023 - ieeexplore.ieee.org
Temporal graph processing is used to handle the snapshots of the temporal graph, which
concerns changes in graph over time. Although several software/hardware solutions have …

PRAGA: A Priority-Aware Hardware/Software Co-design for High-Throughput Graph Processing Acceleration

L Zheng, B Zhu, P Yao, Y Zhou, C Pan… - ACM Transactions on …, 2024 - dl.acm.org
Graph processing is pivotal in deriving insights from complex data structures but faces
performance limitations due to the irregular nature of graphs. Traditional general-purpose …