Authors
Qingbo Wang, Weirong Jiang, Yinglong Xia, Viktor Prasanna
Publication date
2010/12/8
Conference
2010 International Conference on Field-Programmable Technology
Pages
70-77
Publisher
IEEE
Description
Breadth-first search (BFS) is a fundamental graph problem. Because memory accesses to graph data structures are irregular, parallelizing BFS on cache-based systems leads to poor performance. Many issues, such as memory access latency, cache coherence policy, and inter-process synchronization, affect the throughput of BFS on such systems. In our proposed message-passing multi-softcore architecture, parallelization is achieved by exchanging information among autonomous softcores on an FPGA. Several optimizations are performed to reduce traffic on the interconnect and to enable designs with high clock rates. Implementations on a state-of-the-art FPGA achieve clock rates in excess of 100 MHz. The sustained performance of our system ranges from 160 to 795 million edges per second on a DDR3 DRAM. This result approaches the upper bound set by the DRAM bandwidth, and …
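To make the message-passing idea in the abstract concrete, here is a minimal software sketch, not the authors' hardware design. It assumes integer vertex IDs, a hypothetical ownership rule (`v % num_cores`), and a level-synchronous schedule; the function name `message_passing_bfs` and all parameters are illustrative. Each logical core keeps only the levels of the vertices it owns and learns about new frontier vertices solely through messages placed in its inbox.

```python
def message_passing_bfs(adj, source, num_cores=4):
    """Level-synchronous BFS simulated over logical cores (illustrative only).

    adj: dict mapping an integer vertex to a list of integer neighbors.
    Returns a dict mapping each reachable vertex to its BFS level.
    """
    def owner(v):
        # Assumed partitioning rule: vertex v lives on core v % num_cores.
        return v % num_cores

    # Each core stores levels only for the vertices it owns (no shared state).
    local_level = [dict() for _ in range(num_cores)]

    # Inboxes hold vertex IDs sent to a core for the current BFS level.
    inboxes = [[] for _ in range(num_cores)]
    inboxes[owner(source)].append(source)

    depth = 0
    while any(inboxes):
        next_inboxes = [[] for _ in range(num_cores)]
        for core in range(num_cores):
            for v in inboxes[core]:
                if v in local_level[core]:
                    continue  # duplicate message for a visited vertex: drop it
                local_level[core][v] = depth
                # Send each neighbor to the core that owns it.
                for w in adj.get(v, ()):
                    next_inboxes[owner(w)].append(w)
        inboxes = next_inboxes
        depth += 1

    # Merge the per-core results for inspection.
    merged = {}
    for part in local_level:
        merged.update(part)
    return merged


if __name__ == "__main__":
    graph = {0: [1, 2], 1: [3], 2: [3], 3: [4], 4: []}
    print(message_passing_bfs(graph, source=0, num_cores=2))
```

In this sketch every edge traversal produces one message, so interconnect traffic grows with the edge count; the traffic-reducing optimizations the abstract mentions are not modeled here.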
Total citations
[Chart: citations per year, 2012-2024]