A review of the optimal design of neural networks based on FPGA

C Wang, Z Luo - Applied Sciences, 2022 - mdpi.com
Deep learning based on neural networks has been widely used in image recognition,
speech recognition, natural language processing, automatic driving, and other fields and …

HeatViT: Hardware-efficient adaptive token pruning for vision transformers

P Dong, M Sun, A Lu, Y Xie, K Liu… - … Symposium on High …, 2023 - ieeexplore.ieee.org
While vision transformers (ViTs) have continuously achieved new milestones in the field of
computer vision, their sophisticated network architectures with high computation and …

FlightLLM: Efficient large language model inference with a complete mapping flow on FPGAs

S Zeng, J Liu, G Dai, X Yang, T Fu, H Wang… - Proceedings of the …, 2024 - dl.acm.org
Transformer-based Large Language Models (LLMs) have made a significant impact on
various domains. However, LLMs' efficiency suffers from both heavy computation and …

Scaling qubit readout with hardware efficient machine learning architectures

S Maurya, CN Mude, WD Oliver, B Lienhard… - Proceedings of the 50th …, 2023 - dl.acm.org
Reading a qubit is a fundamental operation in quantum computing. It translates quantum
information into classical information enabling subsequent classification to assign the qubit …

BARVINN: Arbitrary precision DNN accelerator controlled by a RISC-V CPU

M Askarihemmat, S Wagner, O Bilaniuk… - Proceedings of the 28th …, 2023 - dl.acm.org
We present a DNN accelerator that allows inference at arbitrary precision with dedicated
processing elements that are configurable at the bit level. Our DNN accelerator has 8 …
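The core idea behind arbitrary-precision processing elements is bit-serial arithmetic: activations are streamed one bit plane at a time, so the same hardware supports any operand width by varying the number of cycles. A minimal software sketch of this scheme (illustrative only; this is not BARVINN's actual matrix-vector unit, and the function name is ours):

```python
import numpy as np

def bit_serial_dot(w, a, a_bits):
    """Compute dot(w, a) for unsigned integer activations by streaming
    activation bits LSB-first, one partial product per bit plane --
    the way a bit-serial PE trades cycles for configurable precision."""
    acc = 0
    for i in range(a_bits):
        plane = (a >> i) & 1            # i-th bit of every activation
        acc += int(np.dot(w, plane)) << i  # partial product, shifted into place
    return acc

# Running 3-bit activations takes 3 cycles, 8-bit takes 8 -- same "hardware".
print(bit_serial_dot(np.array([1, 2, 3]), np.array([5, 6, 7]), 3))
```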

WSQ-AdderNet: Efficient weight standardization based quantized AdderNet FPGA accelerator design with high-density INT8 DSP-LUT co-packing optimization

Y Zhang, B Sun, W Jiang, Y Ha, M Hu… - Proceedings of the 41st …, 2022 - dl.acm.org
Convolutional neural networks (CNNs) have been widely adopted for various machine
intelligence tasks. Nevertheless, CNNs are still known to be computationally demanding due …

HyBNN: Quantifying and Optimizing Hardware Efficiency of Binary Neural Networks

G Yang, J Lei, Z Fang, Y Li, J Zhang, W Xie - ACM Transactions on …, 2024 - dl.acm.org
Binary neural network (BNN), where both the weight and the activation values are
represented with one bit, provides an attractive alternative to deploy highly efficient deep …
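With one-bit weights and activations in {-1, +1}, the dot product that dominates DNN inference collapses to an XNOR followed by a population count, which is why BNNs map so efficiently to FPGA LUTs. A minimal sketch of that equivalence (our own illustration, not code from the HyBNN paper):

```python
import numpy as np

def bin_dot(a, b):
    """Dot product of two {-1,+1} vectors via XNOR-popcount.
    Matching bits contribute +1, mismatches -1, so
    dot(a, b) = n - 2 * popcount(a_bits XOR b_bits)."""
    n = len(a)
    a_bits = (a > 0).astype(np.uint8)   # encode +1 -> 1, -1 -> 0
    b_bits = (b > 0).astype(np.uint8)
    mismatches = np.count_nonzero(a_bits ^ b_bits)
    return n - 2 * mismatches
```

In hardware the XOR and popcount act on packed bit words, replacing a multiply-accumulate array with a handful of logic gates per lane.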

DeepBurning-MixQ: An open source mixed-precision neural network accelerator design framework for FPGAs

E Luo, H Huang, C Liu, G Li, B Yang… - 2023 IEEE/ACM …, 2023 - ieeexplore.ieee.org
Mixed-precision neural networks (MPNNs) that enable the use of just enough data width for
a deep learning task promise significant advantages of both inference accuracy and …

Fast prototyping next-generation accelerators for new ML models using MASE: ML accelerator system exploration

J Cheng, C Zhang, Z Yu… - arXiv preprint arXiv …, 2023 - arxiv.org
Machine learning (ML) accelerators have been studied and used extensively to compute ML
models with high performance and low power. However, designing such accelerators …

You Already Have It: A Generator-Free Low-Precision DNN Training Framework Using Stochastic Rounding

G Yuan, SE Chang, Q Jin, A Lu, Y Li, Y Wu… - … on Computer Vision, 2022 - Springer
Stochastic rounding is a critical technique used in low-precision deep neural networks
(DNNs) training to ensure good model accuracy. However, it requires a large number of …
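Stochastic rounding rounds a value up or down with probability proportional to its distance from the two nearest representable values, so the rounding error is zero in expectation; this unbiasedness is what preserves accuracy when gradients are quantized. A minimal sketch for rounding to integers (our own illustration; the paper's contribution is avoiding the random-number generator this naive version relies on):

```python
import numpy as np

def stochastic_round(x, rng=None):
    """Round each element of x up with probability equal to its
    fractional part, down otherwise, so E[round(x)] == x."""
    rng = np.random.default_rng(0) if rng is None else rng
    floor = np.floor(x)
    frac = x - floor
    return floor + (rng.random(x.shape) < frac)  # bool adds as 0/1
```

For example, 0.3 rounds to 1 about 30% of the time and to 0 the other 70%, so the mean over many draws converges to 0.3 rather than the deterministic 0.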