A review of the optimal design of neural networks based on FPGA

C Wang, Z Luo - Applied Sciences, 2022 - mdpi.com
Deep learning based on neural networks has been widely used in image recognition,
speech recognition, natural language processing, automatic driving, and other fields and …

HeatViT: Hardware-efficient adaptive token pruning for vision transformers

P Dong, M Sun, A Lu, Y Xie, K Liu… - … Symposium on High …, 2023 - ieeexplore.ieee.org
While vision transformers (ViTs) have continuously achieved new milestones in the field of
computer vision, their sophisticated network architectures with high computation and …

FlightLLM: Efficient large language model inference with a complete mapping flow on FPGAs

S Zeng, J Liu, G Dai, X Yang, T Fu, H Wang… - Proceedings of the …, 2024 - dl.acm.org
Transformer-based Large Language Models (LLMs) have made a significant impact on
various domains. However, LLMs' efficiency suffers from both heavy computation and …

Scaling qubit readout with hardware efficient machine learning architectures

S Maurya, CN Mude, WD Oliver, B Lienhard… - Proceedings of the 50th …, 2023 - dl.acm.org
Reading a qubit is a fundamental operation in quantum computing. It translates quantum
information into classical information enabling subsequent classification to assign the qubit …

BARVINN: Arbitrary precision DNN accelerator controlled by a RISC-V CPU

M Askarihemmat, S Wagner, O Bilaniuk… - Proceedings of the 28th …, 2023 - dl.acm.org
We present a DNN accelerator that allows inference at arbitrary precision with dedicated
processing elements that are configurable at the bit level. Our DNN accelerator has 8 …
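The core idea behind arbitrary-precision processing elements is bit-serial arithmetic: activations are streamed one bit plane at a time, so the same hardware supports any operand width by varying the number of cycles. A minimal software sketch of this scheme (illustrative only; this is not BARVINN's actual matrix-vector unit, and the function name is ours):

```python
import numpy as np

def bit_serial_dot(w, a, a_bits):
    """Compute dot(w, a) for unsigned integer activations by streaming
    activation bits LSB-first, one partial product per bit plane --
    the way a bit-serial PE trades cycles for configurable precision."""
    acc = 0
    for i in range(a_bits):
        plane = (a >> i) & 1            # i-th bit of every activation
        acc += int(np.dot(w, plane)) << i  # partial product, shifted into place
    return acc

# Running 3-bit activations takes 3 cycles, 8-bit takes 8 -- same "hardware".
print(bit_serial_dot(np.array([1, 2, 3]), np.array([5, 6, 7]), 3))
```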

WSQ-AdderNet: Efficient weight standardization based quantized AdderNet FPGA accelerator design with high-density INT8 DSP-LUT co-packing optimization

Y Zhang, B Sun, W Jiang, Y Ha, M Hu… - Proceedings of the 41st …, 2022 - dl.acm.org
Convolutional neural networks (CNNs) have been widely adopted for various machine
intelligence tasks. Nevertheless, CNNs are still known to be computationally demanding due …

HyBNN: Quantifying and Optimizing Hardware Efficiency of Binary Neural Networks

G Yang, J Lei, Z Fang, Y Li, J Zhang, W Xie - ACM Transactions on …, 2024 - dl.acm.org
Binary neural network (BNN), where both the weight and the activation values are
represented with one bit, provides an attractive alternative to deploy highly efficient deep …
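With one-bit weights and activations in {-1, +1}, the dot product that dominates DNN inference collapses to an XNOR followed by a population count, which is why BNNs map so efficiently to FPGA LUTs. A minimal sketch of that equivalence (our own illustration, not code from the HyBNN paper):

```python
import numpy as np

def bin_dot(a, b):
    """Dot product of two {-1,+1} vectors via XNOR-popcount.
    Matching bits contribute +1, mismatches -1, so
    dot(a, b) = n - 2 * popcount(a_bits XOR b_bits)."""
    n = len(a)
    a_bits = (a > 0).astype(np.uint8)   # encode +1 -> 1, -1 -> 0
    b_bits = (b > 0).astype(np.uint8)
    mismatches = np.count_nonzero(a_bits ^ b_bits)
    return n - 2 * mismatches
```

In hardware the XOR and popcount act on packed bit words, replacing a multiply-accumulate array with a handful of logic gates per lane.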

DeepBurning-MixQ: An open source mixed-precision neural network accelerator design framework for FPGAs

E Luo, H Huang, C Liu, G Li, B Yang… - 2023 IEEE/ACM …, 2023 - ieeexplore.ieee.org
Mixed-precision neural networks (MPNNs) that enable the use of just enough data width for
a deep learning task promise significant advantages of both inference accuracy and …

Fast prototyping next-generation accelerators for new ML models using MASE: ML accelerator system exploration

J Cheng, C Zhang, Z Yu… - arXiv preprint arXiv …, 2023 - arxiv.org
Machine learning (ML) accelerators have been studied and used extensively to compute ML
models with high performance and low power. However, designing such accelerators …

You Already Have It: A Generator-Free Low-Precision DNN Training Framework Using Stochastic Rounding

G Yuan, SE Chang, Q Jin, A Lu, Y Li, Y Wu… - … on Computer Vision, 2022 - Springer
Stochastic rounding is a critical technique used in low-precision deep neural networks
(DNNs) training to ensure good model accuracy. However, it requires a large number of …
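Stochastic rounding rounds a value up or down with probability proportional to its distance from the two nearest representable values, so the rounding error is zero in expectation; this unbiasedness is what preserves accuracy when gradients are quantized. A minimal sketch for rounding to integers (our own illustration; the paper's contribution is avoiding the random-number generator this naive version relies on):

```python
import numpy as np

def stochastic_round(x, rng=None):
    """Round each element of x up with probability equal to its
    fractional part, down otherwise, so E[round(x)] == x."""
    rng = np.random.default_rng(0) if rng is None else rng
    floor = np.floor(x)
    frac = x - floor
    return floor + (rng.random(x.shape) < frac)  # bool adds as 0/1
```

For example, 0.3 rounds to 1 about 30% of the time and to 0 the other 70%, so the mean over many draws converges to 0.3 rather than the deterministic 0.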