Machine learning at Facebook: Understanding inference at the edge

CJ Wu, D Brooks, K Chen, D Chen… - … symposium on high …, 2019 - ieeexplore.ieee.org
At Facebook, machine learning provides a wide range of capabilities that drive many
aspects of user experience including ranking posts, content understanding, object detection …

[Book] Efficient processing of deep neural networks

V Sze, YH Chen, TJ Yang, JS Emer - 2020 - Springer
This book provides a structured treatment of the key principles and techniques for enabling
efficient processing of deep neural networks (DNNs). DNNs are currently widely used for …

SCALE-Sim: Systolic CNN accelerator simulator

A Samajdar, Y Zhu, P Whatmough, M Mattina… - arXiv preprint arXiv …, 2018 - arxiv.org
Systolic Arrays are one of the most popular compute substrates within Deep Learning
accelerators today, as they provide extremely high efficiency for running dense matrix …
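As a rough illustration of the dense-matrix dataflow that systolic arrays target, below is a minimal cycle-level sketch of an output-stationary systolic matrix multiply. It is not taken from SCALE-Sim; the array organization, operand skew, and naming are assumptions made only for illustration.

```python
import numpy as np

def systolic_matmul(A, B):
    """Cycle-level sketch of an output-stationary systolic array computing C = A @ B.

    PE (i, j) holds and accumulates C[i, j]; operands arrive skewed so that at
    cycle t = i + j + k the PE multiplies A[i, k] (flowing right) by B[k, j]
    (flowing down).
    """
    M, K = A.shape
    K2, N = B.shape
    assert K == K2
    C = np.zeros((M, N), dtype=A.dtype)
    total_cycles = (M - 1) + (N - 1) + K   # fill + drain latency of the array
    for t in range(total_cycles):
        for i in range(M):
            for j in range(N):
                k = t - i - j              # operand pair reaching PE (i, j) this cycle
                if 0 <= k < K:
                    C[i, j] += A[i, k] * B[k, j]
    return C

A = np.random.randint(0, 5, (4, 3))
B = np.random.randint(0, 5, (3, 2))
assert np.array_equal(systolic_matmul(A, B), A @ B)
```

The skewed arrival of operands is what makes every multiply-accumulate a purely local, neighbor-to-neighbor operation, which is the source of the efficiency the snippet refers to.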

A systematic methodology for characterizing scalability of DNN accelerators using SCALE-Sim

A Samajdar, JM Joseph, Y Zhu… - … Analysis of Systems …, 2020 - ieeexplore.ieee.org
The compute demand for deep learning workloads is well known and is a prime motivator for
powerful parallel computing platforms such as GPUs or dedicated hardware accelerators …

S2TA: Exploiting structured sparsity for energy-efficient mobile CNN acceleration

ZG Liu, PN Whatmough, Y Zhu… - 2022 IEEE International …, 2022 - ieeexplore.ieee.org
Exploiting sparsity is a key technique in accelerating quantized convolutional neural network
(CNN) inference on mobile devices. Prior sparse CNN accelerators largely exploit …
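As a rough illustration of the structured (block-level) sparsity such accelerators exploit, here is a sketch of a dot product that skips all-zero weight blocks. The block size and names are assumptions for illustration, not S2TA's actual scheme.

```python
import numpy as np

def block_sparse_dot(weights, activations, block=4):
    """Dot product that skips weight blocks containing only zeros.

    Structured-sparse accelerators record which entries in each block are
    nonzero and only fetch/multiply those, so compute scales with density
    rather than with the nominal vector length.
    """
    acc = 0
    for start in range(0, len(weights), block):
        w_blk = weights[start:start + block]
        if not np.any(w_blk):          # an all-zero block costs no MACs
            continue
        nz = np.nonzero(w_blk)[0]      # indices of nonzero weights in the block
        acc += np.dot(w_blk[nz], activations[start:start + block][nz])
    return acc

w = np.array([0, 0, 0, 0, 2, 0, 1, 0, 0, 3, 0, 0], dtype=np.int32)
x = np.arange(12, dtype=np.int32)
assert block_sparse_dot(w, x) == int(np.dot(w, x))
```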

Hardware acceleration of sparse and irregular tensor computations of ml models: A survey and insights

S Dave, R Baghdadi, T Nowatzki… - Proceedings of the …, 2021 - ieeexplore.ieee.org
Machine learning (ML) models are widely used in many important domains. For efficiently
processing these computational- and memory-intensive applications, tensors of these …

Building the computing system for autonomous micromobility vehicles: Design constraints and architectural optimizations

B Yu, W Hu, L Xu, J Tang, S Liu… - 2020 53rd Annual IEEE …, 2020 - ieeexplore.ieee.org
This paper presents the computing system design in our commercial autonomous vehicles,
and provides detailed performance, energy, and cost analyses. Drawing from our …

High-throughput CNN inference on embedded ARM big.LITTLE multicore processors

S Wang, G Ananthanarayanan, Y Zeng… - … on Computer-Aided …, 2019 - ieeexplore.ieee.org
Internet of Things edge intelligence requires convolutional neural network (CNN) inference
to take place in the edge device itself. The ARM big.LITTLE architecture is at the heart of …

Think fast: A tensor streaming processor (TSP) for accelerating deep learning workloads

D Abts, J Ross, J Sparling… - 2020 ACM/IEEE 47th …, 2020 - ieeexplore.ieee.org
In this paper, we introduce the Tensor Streaming Processor (TSP) architecture, a functionally-
sliced microarchitecture with memory units interleaved with vector and matrix deep learning …

An overview of sparsity exploitation in CNNs for on-device intelligence with software-hardware cross-layer optimizations

S Kang, G Park, S Kim, S Kim, D Han… - IEEE Journal on …, 2021 - ieeexplore.ieee.org
This paper presents a detailed overview of sparsity exploitation in deep neural network
(DNN) accelerators. Despite the algorithmic advancements which drove DNNs to become …