Hardware approximate techniques for deep neural network accelerators: A survey

G Armeniakos, G Zervakis, D Soudris… - ACM Computing …, 2022 - dl.acm.org
Deep Neural Networks (DNNs) are very popular because of their high performance in
various cognitive tasks in Machine Learning (ML). Recent advancements in DNNs have …
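
A family of techniques covered by such surveys is approximate arithmetic, for example multipliers that drop low-order operand bits to save area and energy at a small accuracy cost. Below is a minimal software sketch of that idea only; the 8-bit operand width and truncation width k are illustrative assumptions, not taken from the survey.

import numpy as np

def truncated_mul(a, b, k=3, bits=8):
    """Approximate unsigned multiply: zero the k least-significant bits of each
    operand before multiplying (a toy model of a truncation-based approximate
    multiplier)."""
    mask = (1 << bits) - (1 << k)      # keep only the high (bits - k) bits
    return (a & mask) * (b & mask)

rng = np.random.default_rng(0)
a = rng.integers(0, 256, size=10_000)
b = rng.integers(0, 256, size=10_000)
exact = a * b
approx = truncated_mul(a, b)
rel_err = np.abs(exact - approx) / np.maximum(exact, 1)
print(f"mean relative error: {rel_err.mean():.4f}")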

Service caching and computation reuse strategies at the edge: A survey

C Barrios, M Kumar - ACM Computing Surveys, 2023 - dl.acm.org
With the proliferation of connected devices including smartphones, novel network
connectivity and management methods are needed to meet user Quality of Experience …
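
At its core, computation reuse at the edge is memoization: if an edge node has already run an expensive service (e.g., an inference) on the same input, the cached result is returned instead of being recomputed. The sketch below shows exact-match reuse keyed by an input hash; the LRU policy, cache size, and hash key are illustrative assumptions, not the survey's specific schemes.

import hashlib
from collections import OrderedDict

class ReuseCache:
    """Exact-match computation-reuse cache with LRU eviction."""
    def __init__(self, capacity=1024):
        self.capacity = capacity
        self.store = OrderedDict()

    def _key(self, payload: bytes) -> str:
        return hashlib.sha256(payload).hexdigest()

    def get_or_compute(self, payload: bytes, compute_fn):
        key = self._key(payload)
        if key in self.store:                 # reuse hit: skip recomputation
            self.store.move_to_end(key)
            return self.store[key], True
        result = compute_fn(payload)          # miss: run the service
        self.store[key] = result
        if len(self.store) > self.capacity:   # evict least recently used entry
            self.store.popitem(last=False)
        return result, False

cache = ReuseCache(capacity=2)
expensive = lambda x: len(x)                  # stand-in for a real edge service
print(cache.get_or_compute(b"frame-1", expensive))   # (7, False) -> computed
print(cache.get_or_compute(b"frame-1", expensive))   # (7, True)  -> reused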

Machine learning at Facebook: Understanding inference at the edge

CJ Wu, D Brooks, K Chen, D Chen… - … symposium on high …, 2019 - ieeexplore.ieee.org
At Facebook, machine learning provides a wide range of capabilities that drive many
aspects of user experience including ranking posts, content understanding, object detection …

[BOOK][B] Efficient processing of deep neural networks

V Sze, YH Chen, TJ Yang, JS Emer - 2020 - Springer
This book provides a structured treatment of the key principles and techniques for enabling
efficient processing of deep neural networks (DNNs). DNNs are currently widely used for …

RecNMP: Accelerating personalized recommendation with near-memory processing

L Ke, U Gupta, BY Cho, D Brooks… - 2020 ACM/IEEE 47th …, 2020 - ieeexplore.ieee.org
Personalized recommendation systems leverage deep learning models and account for the
majority of data center AI cycles. Their performance is dominated by memory-bound sparse …
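
The memory-bound kernel behind such recommendation models is an embedding gather-and-reduce: batches of sparse IDs index a large table and the gathered rows are sum-pooled. The NumPy sketch below illustrates that access pattern only (the table size and lookup counts are made up); it models the workload, not the near-memory hardware itself.

import numpy as np

rng = np.random.default_rng(0)
num_rows, dim = 100_000, 64
table = rng.standard_normal((num_rows, dim), dtype=np.float32)  # embedding table

def gather_reduce(table, ids, offsets):
    """Sum-pool the embedding rows selected for each query.
    offsets[i]:offsets[i+1] delimits the IDs belonging to query i."""
    out = np.empty((len(offsets) - 1, table.shape[1]), dtype=table.dtype)
    for i in range(len(offsets) - 1):
        rows = table[ids[offsets[i]:offsets[i + 1]]]   # random, memory-bound gathers
        out[i] = rows.sum(axis=0)
    return out

ids = rng.integers(0, num_rows, size=400)      # 4 queries x 100 lookups each
offsets = np.arange(0, 401, 100)
pooled = gather_reduce(table, ids, offsets)
print(pooled.shape)   # (4, 64)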

Sparse tensor core: Algorithm and hardware co-design for vector-wise sparse neural networks on modern GPUs

M Zhu, T Zhang, Z Gu, Y Xie - Proceedings of the 52nd Annual IEEE …, 2019 - dl.acm.org
Deep neural networks have become a compelling solution for applications such as
image classification, object detection, speech recognition, and machine translation …
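
Vector-wise sparsity constrains pruning so that each fixed-length vector of a weight matrix keeps only its k largest-magnitude entries, a pattern that maps well onto SIMD/tensor-core lanes. The NumPy sketch below shows that pruning pattern; the vector length and k are illustrative choices, not the paper's exact configuration.

import numpy as np

def vector_wise_prune(w, vec_len=8, k=2):
    """Within every contiguous vector of length vec_len along each row,
    keep the k largest-magnitude weights and zero the rest."""
    rows, cols = w.shape
    assert cols % vec_len == 0
    blocks = w.reshape(rows, cols // vec_len, vec_len)
    # indices of the (vec_len - k) smallest-magnitude entries per vector
    drop = np.argsort(np.abs(blocks), axis=-1)[..., : vec_len - k]
    pruned = blocks.copy()
    np.put_along_axis(pruned, drop, 0.0, axis=-1)
    return pruned.reshape(rows, cols)

rng = np.random.default_rng(0)
w = rng.standard_normal((4, 16))
wp = vector_wise_prune(w)
print((wp != 0).sum(axis=1))   # each row keeps k nonzeros per vector: [4 4 4 4]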

Hardware acceleration of sparse and irregular tensor computations of ml models: A survey and insights

S Dave, R Baghdadi, T Nowatzki… - Proceedings of the …, 2021 - ieeexplore.ieee.org
Machine learning (ML) models are widely used in many important domains. For efficiently
processing these computation- and memory-intensive applications, tensors of these …
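
A recurring building block in such accelerators is sparse matrix-vector multiplication over a compressed format like CSR, which stores only the nonzeros plus index metadata. The short Python/NumPy sketch below illustrates the data structure and the multiply; it is not modeled on any particular accelerator in the survey.

import numpy as np

def dense_to_csr(a):
    """Compress a dense matrix into CSR arrays (values, column indices, row pointers)."""
    vals, cols, ptrs = [], [], [0]
    for row in a:
        nz = np.nonzero(row)[0]
        vals.extend(row[nz]); cols.extend(nz)
        ptrs.append(len(vals))
    return np.array(vals), np.array(cols), np.array(ptrs)

def csr_spmv(vals, cols, ptrs, x):
    """y = A @ x, touching only the stored nonzeros."""
    y = np.zeros(len(ptrs) - 1)
    for i in range(len(y)):
        lo, hi = ptrs[i], ptrs[i + 1]
        y[i] = vals[lo:hi] @ x[cols[lo:hi]]
    return y

rng = np.random.default_rng(0)
a = rng.standard_normal((6, 6)) * (rng.random((6, 6)) < 0.3)   # ~70% zeros
x = rng.standard_normal(6)
vals, cols, ptrs = dense_to_csr(a)
print(np.allclose(csr_spmv(vals, cols, ptrs, x), a @ x))   # True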

GoSPA: An energy-efficient high-performance globally optimized sparse convolutional neural network accelerator

C Deng, Y Sui, S Liao, X Qian… - 2021 ACM/IEEE 48th …, 2021 - ieeexplore.ieee.org
The co-existence of activation sparsity and model sparsity in convolutional neural network
(CNN) models makes sparsity-aware CNN hardware designs very attractive. The existing …
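
Exploiting activation and weight sparsity together means a multiply is issued only where both operands are nonzero, e.g., by intersecting their index lists. The toy sketch below models that intersection-based dot product and counts the effectual multiplies; it illustrates the general idea, not GoSPA's actual dataflow.

import numpy as np

def sparse_sparse_dot(act, wgt):
    """Dot product that multiplies only positions where both the activation
    and the weight are nonzero (two-sided sparsity)."""
    a_idx = set(np.nonzero(act)[0])
    w_idx = np.nonzero(wgt)[0]
    common = [i for i in w_idx if i in a_idx]     # index intersection
    macs = len(common)                            # effectual multiplies actually issued
    return sum(act[i] * wgt[i] for i in common), macs

rng = np.random.default_rng(0)
act = rng.standard_normal(64) * (rng.random(64) < 0.4)   # ~60% zero activations
wgt = rng.standard_normal(64) * (rng.random(64) < 0.3)   # ~70% zero weights
val, macs = sparse_sparse_dot(act, wgt)
print(np.isclose(val, act @ wgt), f"{macs}/64 multiplies performed")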

Mesorasi: Architecture support for point cloud analytics via delayed-aggregation

Y Feng, B Tian, T Xu, P Whatmough… - 2020 53rd Annual IEEE …, 2020 - ieeexplore.ieee.org
Point cloud analytics is poised to become a key workload on battery-powered embedded
and mobile platforms in a wide range of emerging application domains, such as …
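
Delayed aggregation exploits the fact that a point shared by many neighborhoods would otherwise be pushed through the shared feature transform once per neighborhood. The heavily simplified sketch below shows that reuse opportunity with a single linear map standing in for the MLP and max pooling as the aggregation; it ignores the relative-coordinate handling of real point-cloud networks, so it is an illustration of the general idea rather than the paper's exact dataflow.

import numpy as np

rng = np.random.default_rng(0)
n_points, feat, out = 128, 16, 32
pts = rng.standard_normal((n_points, feat))
W = rng.standard_normal((feat, out))                 # stand-in for the shared MLP
nbrs = [rng.choice(n_points, size=16, replace=False) for _ in range(n_points)]

# Aggregate-early style: re-run the transform for every (neighborhood, neighbor) pair.
early = np.stack([np.maximum.reduce(pts[n] @ W) for n in nbrs])   # 128 x 16 transforms

# Delayed aggregation: transform each point once, then max-pool per neighborhood.
feats = pts @ W                                      # 128 transforms, reused everywhere
late = np.stack([np.maximum.reduce(feats[n]) for n in nbrs])

print(np.allclose(early, late))    # True: same output, ~16x fewer transforms here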

TIE: Energy-efficient tensor train-based inference engine for deep neural network

C Deng, F Sun, X Qian, J Lin, Z Wang… - Proceedings of the 46th …, 2019 - dl.acm.org
In the era of artificial intelligence (AI), deep neural networks (DNNs) have emerged as the
most important and powerful AI technique. However, large DNN models are both storage …
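
Tensor-train (TT) decomposition factors a large weight matrix into a chain of small cores, shrinking storage while the layer output can still be recovered by contracting the cores. The two-core NumPy sketch below shows the storage saving; the dimensions and TT rank are invented for illustration, and the full matrix is materialized only to check shapes, whereas a TT inference engine contracts the input with the cores directly.

import numpy as np

rng = np.random.default_rng(0)
# Represent a (m1*m2) x (n1*n2) weight matrix with two TT cores of rank r.
m1, m2, n1, n2, r = 16, 16, 32, 32, 4
g1 = rng.standard_normal((m1, n1, r))      # core 1: (row part 1, col part 1, rank)
g2 = rng.standard_normal((r, m2, n2))      # core 2: (rank, row part 2, col part 2)

# W[(i1,i2),(j1,j2)] = sum_r g1[i1,j1,r] * g2[r,i2,j2]
w = np.einsum('iar,rjb->ijab', g1, g2).reshape(m1 * m2, n1 * n2)

x = rng.standard_normal(n1 * n2)
y = w @ x                                  # the fully-connected layer's output
print(y.shape, "dense params:", w.size, "TT params:", g1.size + g2.size)  # 64x fewer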