High-productivity programming and optimization framework for stream processing on FPGA

J Lee, T Ueno, M Sato, K Sano - … of the 9th International Symposium on …, 2018 - dl.acm.org
Because of the recent slowdown in silicon technology and the increasing power consumption of hardware, several dedicated architectures have been proposed in High Performance Computing (HPC) to exploit the limited number of transistors in a chip with low power consumption. Although the Field-Programmable Gate Array (FPGA) is considered one of the promising technologies for realizing dedicated hardware, it is difficult for non-domain experts to program FPGAs because of the gap between their applications and the hardware-level programming models for FPGAs. In this paper, we propose a C/C++ based programming framework, C2SPD, to describe stream processing on FPGA. It uses SPGen, a dataflow High Level Synthesis (HLS) tool, as the FPGA backend. With C2SPD, FPGAs are programmed by adding directives to serial code, and the compiler translates the annotated code into optimized SPGen code. The range of applications is limited in C2SPD because of its domain-specific approach. However, it can generate efficient hardware on FPGAs when the programming model matches the target application. A 2D-stencil computation is written in C2SPD to evaluate the performance of our C2SPD compiler implementation. Two lines of C2SPD directives are added to the serial C code to generate FPGA hardware. The optimized FPGA hardware achieves 175.41 GFLOPS by using 256 stream cores, which is 220 times faster than a single stream core.
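To make the evaluation workload concrete, the following is a minimal sketch of the kind of serial 2D-stencil C code the abstract refers to. It is not taken from the paper: the function name `stencil_step`, the 5-point neighborhood, the 0.2 weight, and the flat row-major layout are all assumptions chosen for illustration. The abstract does not give the syntax of the C2SPD directives, so none is shown.

```c
#include <assert.h>
#include <stddef.h>

/* One sweep of a 5-point 2D stencil over an nx-by-ny grid stored in
 * row-major order. Hypothetical example: the name, the weights, and the
 * data layout are illustrative assumptions, not the paper's kernel.
 * Boundary cells are copied unchanged; each interior cell becomes the
 * average of itself and its four neighbors. */
void stencil_step(const float *in, float *out, size_t nx, size_t ny)
{
    /* The sweep reads old neighbor values, so it must not run in place. */
    assert(in != out);

    for (size_t y = 0; y < ny; y++) {
        for (size_t x = 0; x < nx; x++) {
            size_t i = y * nx + x;
            if (y == 0 || y == ny - 1 || x == 0 || x == nx - 1) {
                out[i] = in[i];                      /* boundary: copy */
            } else {
                out[i] = 0.2f * (in[i]               /* center */
                               + in[i - nx]          /* up     */
                               + in[i + nx]          /* down   */
                               + in[i - 1]           /* left   */
                               + in[i + 1]);         /* right  */
            }
        }
    }
}
```

Each output cell depends only on a fixed neighborhood of the input, which is the regular access pattern that suits a streaming dataflow pipeline. As a rough reading of the reported figures, 175.41 GFLOPS at a 220-fold speedup implies about 0.80 GFLOPS for a single stream core, and 220/256 corresponds to roughly 86% scaling efficiency across 256 cores.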