A configurable cloud-scale DNN processor for real-time AI

J Fowers, K Ovtcharov, M Papamichael… - 2018 ACM/IEEE 45th …, 2018 - ieeexplore.ieee.org
Interactive AI-powered services require low-latency evaluation of deep neural network
(DNN) models, aka "real-time AI". The growing demand for computationally expensive …

A survey of FPGA-based neural network accelerator

K Guo, S Zeng, J Yu, Y Wang, H Yang - arXiv preprint arXiv:1712.08934, 2017 - arxiv.org
Recent researches on neural network have shown significant advantage in machine
learning over traditional algorithms based on handcrafted features and models. Neural …

DrAcc: A DRAM based accelerator for accurate CNN inference

Q Deng, L Jiang, Y Zhang, M Zhang… - Proceedings of the 55th …, 2018 - dl.acm.org
Modern Convolutional Neural Networks (CNNs) are computation and memory intensive.
Thus it is crucial to develop hardware accelerators to achieve high performance as well as …

Virtualizing FPGAs in the cloud

Y Zha, J Li - Proceedings of the Twenty-Fifth International …, 2020 - dl.acm.org
Field-Programmable Gate Arrays (FPGAs) have been integrated into the cloud infrastructure
to enhance its computing performance by supporting on-demand acceleration. However …

[DL] A survey of FPGA-based neural network inference accelerators

K Guo, S Zeng, J Yu, Y Wang, H Yang - ACM Transactions on …, 2019 - dl.acm.org
Recent research on neural networks has shown a significant advantage in machine learning
over traditional algorithms based on handcrafted features and models. Neural networks are …

On-chip memory based binarized convolutional deep neural network applying batch normalization free technique on an FPGA

H Yonekawa, H Nakahara - 2017 IEEE international parallel …, 2017 - ieeexplore.ieee.org
A pre-trained convolutional deep neural network (CNN) is a feed-forward computation
perspective, which is widely used for the embedded systems, requires highly power-and …

Scalable high-performance architecture for convolutional ternary neural networks on FPGA

A Prost-Boucle, A Bourge, F Pétrot… - … conference on field …, 2017 - ieeexplore.ieee.org
Thanks to their excellent performances on typical artificial intelligence problems, deep
neural networks have drawn a lot of interest lately. However, this comes at the cost of large …

XNOR-POP: A processing-in-memory architecture for binary convolutional neural networks in Wide-IO2 DRAMs

L Jiang, M Kim, W Wen, D Wang - 2017 IEEE/ACM International …, 2017 - ieeexplore.ieee.org
It is challenging to adopt computing-intensive and parameter-rich Convolutional Neural
Networks (CNNs) in mobile devices due to limited hardware resources and low power …

[BOOK][B] Robotic computing on FPGAs

S Liu, Z Wan, B Yu, Y Wang - 2021 - Springer
This book provides a thorough overview of the state-of-the-art field-programmable gate array
(FPGA)-based robotic computing accelerator designs and summarizes their adopted …

A hardware-friendly low-bit power-of-two quantization method for CNNs and its FPGA implementation

X Sui, Q Lv, Y Bai, B Zhu, L Zhi, Y Yang, Z Tan - Sensors, 2022 - mdpi.com
To address the problems of convolutional neural networks (CNNs) consuming more
hardware resources (such as DSPs and RAMs on FPGAs) and their accuracy, efficiency …