Efficient hardware architectures for accelerating deep neural networks: Survey

P Dhilleswararao, S Boppu, MS Manikandan… - IEEE …, 2022 - ieeexplore.ieee.org
In the modern-day era of technology, a paradigm shift has been witnessed in the areas
involving applications of Artificial Intelligence (AI), Machine Learning (ML), and Deep …

A survey of coarse-grained reconfigurable architecture and design: Taxonomy, challenges, and applications

L Liu, J Zhu, Z Li, Y Lu, Y Deng, J Han, S Yin… - ACM Computing …, 2019 - dl.acm.org
As general-purpose processors have hit the power wall and chip fabrication cost escalates
alarmingly, coarse-grained reconfigurable architectures (CGRAs) are attracting increasing …

Simba: Scaling deep-learning inference with multi-chip-module-based architecture

YS Shao, J Clemons, R Venkatesan, B Zimmer… - Proceedings of the …, 2019 - dl.acm.org
Package-level integration using multi-chip-modules (MCMs) is a promising approach for
building large-scale systems. Compared to a large monolithic die, an MCM combines many …

In-datacenter performance analysis of a tensor processing unit

NP Jouppi, C Young, N Patil, D Patterson… - Proceedings of the 44th …, 2017 - dl.acm.org
Many architects believe that major improvements in cost-energy-performance must now
come from domain-specific hardware. This paper evaluates a custom ASIC, called a Tensor …

A configurable cloud-scale DNN processor for real-time AI

J Fowers, K Ovtcharov, M Papamichael… - 2018 ACM/IEEE 45th …, 2018 - ieeexplore.ieee.org
Interactive AI-powered services require low-latency evaluation of deep neural network
(DNN) models, aka "real-time AI". The growing demand for computationally expensive …

Domain-specific hardware accelerators

WJ Dally, Y Turakhia, S Han - Communications of the ACM, 2020 - dl.acm.org
Communications of the ACM, July 2020, Vol. 63, No. 7. From the simple embedded
processor in your …

ISAAC: A convolutional neural network accelerator with in-situ analog arithmetic in crossbars

A Shafiee, A Nag, N Muralimanohar… - ACM SIGARCH …, 2016 - dl.acm.org
A number of recent efforts have attempted to design accelerators for popular machine
learning algorithms, such as those involving convolutional and deep neural networks (CNNs …

Eyeriss: A spatial architecture for energy-efficient dataflow for convolutional neural networks

YH Chen, J Emer, V Sze - ACM SIGARCH computer architecture news, 2016 - dl.acm.org
Deep convolutional neural networks (CNNs) are widely used in modern AI systems for their
superior accuracy but at the cost of high computational complexity. The complexity comes …

MAERI: Enabling flexible dataflow mapping over DNN accelerators via reconfigurable interconnects

H Kwon, A Samajdar, T Krishna - ACM SIGPLAN Notices, 2018 - dl.acm.org
Deep neural networks (DNN) have demonstrated highly promising results across computer
vision and speech recognition, and are becoming foundational for ubiquitous AI. The …

Fused-layer CNN accelerators

M Alwani, H Chen, M Ferdman… - 2016 49th Annual IEEE …, 2016 - ieeexplore.ieee.org
Deep convolutional neural networks (CNNs) are rapidly becoming the dominant approach to
computer vision and a major component of many other pervasive machine learning tasks …