Accelerating CNN inference on ASICs: A survey

D Moolchandani, A Kumar, SR Sarangi - Journal of Systems Architecture, 2021 - Elsevier
Convolutional neural networks (CNNs) have proven to be a disruptive technology in most
vision, speech and image processing tasks. Given their ubiquitous acceptance, the research …

[BOOK][B] Efficient processing of deep neural networks

V Sze, YH Chen, TJ Yang, JS Emer - 2020 - Springer
This book provides a structured treatment of the key principles and techniques for enabling
efficient processing of deep neural networks (DNNs). DNNs are currently widely used for …

SparTen: A sparse tensor accelerator for convolutional neural networks

A Gondimalla, N Chesnut, M Thottethodi… - Proceedings of the …, 2019 - dl.acm.org
Convolutional neural networks (CNNs) are emerging as powerful tools for image
processing. Recent machine learning work has reduced CNNs' compute and data volumes …

Understanding reuse, performance, and hardware cost of DNN dataflow: A data-centric approach

H Kwon, P Chatarasi, M Pellauer, A Parashar… - Proceedings of the …, 2019 - dl.acm.org
The data partitioning and scheduling strategies used by DNN accelerators to leverage reuse
and perform staging are known as dataflow, which directly impacts the performance and …

Hardware acceleration of sparse and irregular tensor computations of ML models: A survey and insights

S Dave, R Baghdadi, T Nowatzki… - Proceedings of the …, 2021 - ieeexplore.ieee.org
Machine learning (ML) models are widely used in many important domains. For efficiently
processing these computational- and memory-intensive applications, tensors of these …

Think fast: A tensor streaming processor (TSP) for accelerating deep learning workloads

D Abts, J Ross, J Sparling… - 2020 ACM/IEEE 47th …, 2020 - ieeexplore.ieee.org
In this paper, we introduce the Tensor Streaming Processor (TSP) architecture, a functionally-
sliced microarchitecture with memory units interleaved with vector and matrix deep learning …

Mesorasi: Architecture support for point cloud analytics via delayed-aggregation

Y Feng, B Tian, T Xu, P Whatmough… - 2020 53rd Annual IEEE …, 2020 - ieeexplore.ieee.org
Point cloud analytics is poised to become a key workload on battery-powered embedded
and mobile platforms in a wide range of emerging application domains, such as …

AccPar: Tensor partitioning for heterogeneous deep learning accelerators

L Song, F Chen, Y Zhuo, X Qian, H Li… - 2020 IEEE International …, 2020 - ieeexplore.ieee.org
Deep neural network (DNN) accelerators as an example of domain-specific architecture
have demonstrated great success in DNN inference. However, the architecture acceleration …

eCNN: A block-based and highly-parallel CNN accelerator for edge inference

CT Huang, YC Ding, HC Wang, CW Weng… - Proceedings of the …, 2019 - dl.acm.org
Convolutional neural networks (CNNs) have recently demonstrated superior quality for
computational imaging applications. Therefore, they have great potential to revolutionize the …

ShapeShifter: Enabling fine-grain data width adaptation in deep learning

AD Lascorz, S Sharify, I Edo, DM Stuart… - Proceedings of the …, 2019 - dl.acm.org
We show that selecting a data width for all values in Deep Neural Networks, quantized or not
and even if that width is different per layer, amounts to worst-case design. Much shorter data …