FPGA based CNN accelerator for high-speed biomedical application

H Nehete, G Verma, A Gupta… - High-Speed …, 2023 - spiedigitallibrary.org
High-Speed Biomedical Imaging and Spectroscopy VIII, 2023, spiedigitallibrary.org
The rise in visual dataset generation has necessitated recent advances in the field of Deep Neural Networks (DNNs). Application domains such as biomedical imaging require a high level of precision, which convolutional neural networks (CNNs) achieve at the expense of increased computation, hardware, and power resources. Implementing such CNN architectures is constrained by the resource-limited embedded and application-specific integrated circuit (ASIC) systems currently available. This work presents a field-programmable gate array (FPGA) based hardware accelerator with a generalized architecture for convolution and fully connected (FC) layers that exploits massive intra-layer parallelism. After detailed design space exploration, compute-intensive convolution layers are replaced by depthwise separable (DS) convolution layers, which reduces the number of computations and memory accesses by 7.8x and 10x, respectively, for the VGG8 network. Furthermore, parallel computation of arithmetic tasks reduces the compute bound of the proposed architecture. A reduced-precision data type for both inputs and weights yields an overall reduction in latency and resource utilization. FPGA implementation results of the proposed CNN accelerator, for classifiers trained on subsets of the MedMNIST dataset, show a balance between high performance (214.5 GOP/s for the DS convolution layer) and low resource utilization.
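To make the computation savings from depthwise separable convolution concrete, the sketch below counts multiply-accumulate (MAC) operations for a standard convolution and for its DS replacement (a depthwise pass followed by a 1x1 pointwise pass). The function names and the layer shape used in the example are illustrative assumptions, not the paper's VGG8 configuration; stride 1, "same" padding, and no bias terms are also assumed.

```python
def standard_conv_macs(h, w, c_in, c_out, k):
    """MACs for a standard k x k convolution producing an h x w output map."""
    return h * w * c_in * c_out * k * k


def ds_conv_macs(h, w, c_in, c_out, k):
    """MACs for a depthwise separable convolution with the same output shape."""
    depthwise = h * w * c_in * k * k   # one k x k filter per input channel
    pointwise = h * w * c_in * c_out   # 1 x 1 convolution that mixes channels
    return depthwise + pointwise


def reduction_factor(h, w, c_in, c_out, k):
    """How many times fewer MACs the DS layer needs than the standard layer."""
    return standard_conv_macs(h, w, c_in, c_out, k) / ds_conv_macs(h, w, c_in, c_out, k)


if __name__ == "__main__":
    # Hypothetical layer shape chosen only for illustration.
    h, w, c_in, c_out, k = 28, 28, 64, 128, 3
    print("standard MACs:", standard_conv_macs(h, w, c_in, c_out, k))
    print("DS MACs:      ", ds_conv_macs(h, w, c_in, c_out, k))
    print("reduction:    ", round(reduction_factor(h, w, c_in, c_out, k), 2))
```

Under these assumptions the reduction factor simplifies to 1 / (1/c_out + 1/k²), so for 3x3 kernels it is bounded above by 9. The reported 7.8x figure for the whole network is consistent with that bound, although the actual per-layer shapes are not given in the abstract.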