Model compression and hardware acceleration for neural networks: A comprehensive survey

L Deng, G Li, S Han, L Shi, Y Xie - Proceedings of the IEEE, 2020 - ieeexplore.ieee.org
Domain-specific hardware is becoming a promising topic against the backdrop of slowing
improvement for general-purpose processors due to the foreseeable end of Moore's Law …

Efficient acceleration of deep learning inference on resource-constrained edge devices: A review

MMH Shuvo, SK Islam, J Cheng… - Proceedings of the …, 2022 - ieeexplore.ieee.org
Successful integration of deep neural networks (DNNs) or deep learning (DL) has resulted
in breakthroughs in many areas. However, deploying these highly accurate models for data …

Dynamic neural networks: A survey

Y Han, G Huang, S Song, L Yang… - IEEE Transactions on …, 2021 - ieeexplore.ieee.org
Dynamic neural networks are an emerging research topic in deep learning. Compared to static
models, which have fixed computational graphs and parameters at the inference stage …
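The contrast with static models can be made concrete with a small sketch. The example below is illustrative only and is not taken from the survey: it shows one common dynamic-inference pattern, an early-exit classifier that skips its expensive stage when a cheap stage is already confident. The layer shapes and the 0.9 confidence threshold are assumptions made for the example.

```python
# Illustrative sketch of input-dependent ("dynamic") inference via early exit.
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def early_exit_forward(x, shallow_layers, deep_layers, exit_head, final_head,
                       confidence_threshold=0.9):
    """Run the cheap shallow stage; stop early when its prediction is confident."""
    h = x
    for w, b in shallow_layers:              # cheap stage, always executed
        h = np.maximum(0.0, h @ w + b)       # ReLU MLP block
    probs = softmax(h @ exit_head[0] + exit_head[1])
    if probs.max() >= confidence_threshold:  # confident: skip the expensive stage
        return probs, "early_exit"
    for w, b in deep_layers:                 # expensive stage, only for hard inputs
        h = np.maximum(0.0, h @ w + b)
    return softmax(h @ final_head[0] + final_head[1]), "full_path"

# Usage: weights here are random placeholders, so either path may be taken.
rng = np.random.default_rng(0)
shallow = [(rng.standard_normal((16, 16)), np.zeros(16))]
deep = [(rng.standard_normal((16, 16)), np.zeros(16))]
exit_head = (rng.standard_normal((16, 4)), np.zeros(4))
final_head = (rng.standard_normal((16, 4)), np.zeros(4))
probs, path = early_exit_forward(rng.standard_normal(16), shallow, deep,
                                 exit_head, final_head)
```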

Sparsity in deep learning: Pruning and growth for efficient inference and training in neural networks

T Hoefler, D Alistarh, T Ben-Nun, N Dryden… - Journal of Machine …, 2021 - jmlr.org
The growing energy and performance costs of deep learning have driven the community to
reduce the size of neural networks by selectively pruning components. Similarly to their …
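As a concrete illustration of "selectively pruning components", the sketch below implements global magnitude pruning in plain NumPy. It is a generic baseline rather than any specific method surveyed in the paper, and the 80% sparsity target is an arbitrary choice for the example.

```python
# Illustrative sketch: global magnitude pruning, i.e. zero out the smallest-magnitude
# weights across all layers so the network becomes sparse.
import numpy as np

def magnitude_prune(weights, sparsity=0.8):
    """Return pruned copies of each weight matrix plus the binary masks.

    weights:  list of NumPy arrays, one per layer
    sparsity: fraction of all weights to zero, via one global magnitude threshold
    """
    all_mags = np.concatenate([np.abs(w).ravel() for w in weights])
    threshold = np.quantile(all_mags, sparsity)         # global magnitude cutoff
    masks = [(np.abs(w) > threshold).astype(w.dtype) for w in weights]
    pruned = [w * m for w, m in zip(weights, masks)]
    return pruned, masks

# Usage: prune two random layers to roughly 80% sparsity.
layers = [np.random.randn(128, 64), np.random.randn(64, 10)]
pruned, masks = magnitude_prune(layers, sparsity=0.8)
```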

Eyeriss v2: A flexible accelerator for emerging deep neural networks on mobile devices

YH Chen, TJ Yang, J Emer… - IEEE Journal on Emerging …, 2019 - ieeexplore.ieee.org
A recent trend in deep neural network (DNN) development is to extend the reach of deep
learning applications to platforms that are more resource- and energy-constrained, e.g., …

Machine learning at Facebook: Understanding inference at the edge

CJ Wu, D Brooks, K Chen, D Chen… - … symposium on high …, 2019 - ieeexplore.ieee.org
At Facebook, machine learning provides a wide range of capabilities that drive many
aspects of user experience including ranking posts, content understanding, object detection …

Simba: Scaling deep-learning inference with multi-chip-module-based architecture

YS Shao, J Clemons, R Venkatesan, B Zimmer… - Proceedings of the …, 2019 - dl.acm.org
Package-level integration using multi-chip-modules (MCMs) is a promising approach for
building large-scale systems. Compared to a large monolithic die, an MCM combines many …

A configurable cloud-scale DNN processor for real-time AI

J Fowers, K Ovtcharov, M Papamichael… - 2018 ACM/IEEE 45th …, 2018 - ieeexplore.ieee.org
Interactive AI-powered services require low-latency evaluation of deep neural network
(DNN) models, aka "real-time AI". The growing demand for computationally expensive …

In-datacenter performance analysis of a tensor processing unit

NP Jouppi, C Young, N Patil, D Patterson… - Proceedings of the 44th …, 2017 - dl.acm.org
Many architects believe that major improvements in cost-energy-performance must now
come from domain-specific hardware. This paper evaluates a custom ASIC, called a Tensor …

Efficient processing of deep neural networks: A tutorial and survey

V Sze, YH Chen, TJ Yang, JS Emer - Proceedings of the IEEE, 2017 - ieeexplore.ieee.org
Deep neural networks (DNNs) are currently widely used for many artificial intelligence (AI)
applications including computer vision, speech recognition, and robotics. While DNNs …