Model compression and hardware acceleration for neural networks: A comprehensive survey

L Deng, G Li, S Han, L Shi, Y Xie - Proceedings of the IEEE, 2020 - ieeexplore.ieee.org
Domain-specific hardware is becoming a promising topic against the backdrop of slowing
improvements in general-purpose processors due to the foreseeable end of Moore's Law …

Sparsity in deep learning: Pruning and growth for efficient inference and training in neural networks

T Hoefler, D Alistarh, T Ben-Nun, N Dryden… - Journal of Machine …, 2021 - jmlr.org
The growing energy and performance costs of deep learning have driven the community to
reduce the size of neural networks by selectively pruning components. Similarly to their …

SIGMA: A sparse and irregular GEMM accelerator with flexible interconnects for DNN training

E Qin, A Samajdar, H Kwon, V Nadella… - … Symposium on High …, 2020 - ieeexplore.ieee.org
The advent of Deep Learning (DL) has radically transformed the computing industry across
the entire spectrum from algorithms to circuits. As myriad application domains embrace DL, it …

SPINN: synergistic progressive inference of neural networks over device and cloud

S Laskaridis, SI Venieris, M Almeida… - Proceedings of the 26th …, 2020 - dl.acm.org
Despite the soaring use of convolutional neural networks (CNNs) in mobile applications,
uniformly sustaining high-performance inference on mobile has been elusive due to the …

Machine learning at Facebook: Understanding inference at the edge

CJ Wu, D Brooks, K Chen, D Chen… - … symposium on high …, 2019 - ieeexplore.ieee.org
At Facebook, machine learning provides a wide range of capabilities that drive many
aspects of user experience including ranking posts, content understanding, object detection …

Inducing and exploiting activation sparsity for fast inference on deep neural networks

M Kurtz, J Kopinsky, R Gelashvili… - International …, 2020 - proceedings.mlr.press
Optimizing convolutional neural networks for fast inference has recently become an
extremely active area of research. One of the go-to solutions in this context is weight …

RecNMP: Accelerating personalized recommendation with near-memory processing

L Ke, U Gupta, BY Cho, D Brooks… - 2020 ACM/IEEE 47th …, 2020 - ieeexplore.ieee.org
Personalized recommendation systems leverage deep learning models and account for the
majority of data center AI cycles. Their performance is dominated by memory-bound sparse …

Resource-efficient convolutional networks: A survey on model-, arithmetic-, and implementation-level techniques

JK Lee, L Mukhanov, AS Molahosseini… - ACM Computing …, 2023 - dl.acm.org
Convolutional neural networks (CNNs) are used in our daily life, including self-driving cars,
virtual assistants, social network services, healthcare services, and face recognition, among …

TensorDIMM: A practical near-memory processing architecture for embeddings and tensor operations in deep learning

Y Kwon, Y Lee, M Rhu - Proceedings of the 52nd Annual IEEE/ACM …, 2019 - dl.acm.org
Recent studies from several hyperscalers point to embedding layers as the most memory-
intensive deep learning (DL) algorithm being deployed in today's datacenters. This paper …

The lazy neuron phenomenon: On emergence of activation sparsity in transformers

Z Li, C You, S Bhojanapalli, D Li, AS Rawat… - arXiv preprint arXiv …, 2022 - arxiv.org
This paper studies the curious phenomenon that machine learning models with Transformer
architectures have sparse activation maps. By activation map we refer to the …