Hardware approximate techniques for deep neural network accelerators: A survey
Deep Neural Networks (DNNs) are very popular because of their high performance in
various cognitive tasks in Machine Learning (ML). Recent advancements in DNNs have …
Survey of deep learning accelerators for edge and emerging computing
The unprecedented progress in artificial intelligence (AI), particularly in deep learning
algorithms with ubiquitous internet connected smart devices, has created a high demand for …
A 5-nm 254-TOPS/W 221-TOPS/mm² Fully-Digital Computing-in-Memory Macro Supporting Wide-Range Dynamic-Voltage-Frequency Scaling and Simultaneous …
H Fujiwara, H Mori, WC Zhao… - … Solid-State Circuits …, 2022 - ieeexplore.ieee.org
Computing-in-memory (CIM) is being widely explored to minimize power consumption in
data movement and multiply-and-accumulate (MAC) for edge-AI devices. Although most …
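To make concrete the operation that a CIM macro evaluates in-array rather than in a separate datapath, below is a minimal NumPy sketch of an INT8 multiply-and-accumulate; the function name and operand widths are illustrative assumptions, not details taken from the paper.

```python
import numpy as np

def int8_mac(activations: np.ndarray, weights: np.ndarray) -> int:
    """Multiply-and-accumulate over INT8 operands, widened to INT32.

    This is the scalar kernel a digital CIM macro computes in parallel
    inside the memory array, avoiding the data movement of streaming
    operands to a remote multiplier.
    """
    acc = np.int32(0)
    for a, w in zip(activations.astype(np.int32), weights.astype(np.int32)):
        acc += a * w  # one MAC; a CIM macro performs many per cycle in-array
    return int(acc)

# Example: a 16-element INT8 dot product
rng = np.random.default_rng(0)
x = rng.integers(-128, 128, size=16, dtype=np.int8)
w = rng.integers(-128, 128, size=16, dtype=np.int8)
print(int8_mac(x, w))
```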
Sparsity-aware and re-configurable NPU architecture for Samsung flagship mobile SoC
Of late, deep neural networks have become ubiquitous in mobile applications. As mobile
devices generally require immediate response while maintaining user privacy, the demand …
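As a rough illustration of the scheduling idea behind sparsity-aware NPUs, the sketch below skips the multiply whenever an activation is zero, so compute scales with density rather than tensor size; the function and data are hypothetical, not the Samsung design.

```python
import numpy as np

def zero_skipping_mac(activations, weights):
    """Accumulate only over nonzero activations.

    A sparsity-aware datapath issues work to MAC lanes only for
    nonzero operands; zeros are skipped entirely.
    """
    acc = 0
    skipped = 0
    for a, w in zip(activations, weights):
        if a == 0:          # zero operand contributes nothing; skip the multiply
            skipped += 1
            continue
        acc += int(a) * int(w)
    return acc, skipped

x = np.array([0, 3, 0, 0, -2, 5, 0, 1], dtype=np.int8)  # ReLU outputs are often sparse
w = np.array([1, -4, 2, 7, 6, -1, 3, 2], dtype=np.int8)
print(zero_skipping_mac(x, w))  # (accumulated sum, multiplies avoided)
```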
A multi-mode 8K-MAC HW-utilization-aware neural processing unit with a unified multi-precision datapath in 4-nm flagship mobile SoC
JS Park, C Park, S Kwon, T Jeon… - IEEE Journal of Solid …, 2022 - ieeexplore.ieee.org
This article presents an 8k-multiply-accumulate (MAC) neural processing unit (NPU) in 4-nm
mobile system-on-chip (SoC). The unified multi-precision MACs support from integer (INT) …
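One common way a unified multi-precision datapath reuses narrow multipliers for wider operands is to assemble an INT8 product from four INT4-sized partial products. The sketch below shows that decomposition; it is an assumption about the general technique, not the specific NPU circuit.

```python
def split_nibbles(x: int):
    """Split a signed INT8 value into a signed high nibble and an
    unsigned low nibble such that x == hi * 16 + lo."""
    hi = x >> 4        # arithmetic shift keeps the sign
    lo = x & 0xF       # low 4 bits, always in 0..15
    return hi, lo

def int8_mul_via_int4(a: int, b: int) -> int:
    """Compute an INT8 x INT8 product from four INT4-sized partial
    products, the way a unified datapath can reuse narrow multipliers
    for wider operands."""
    a_hi, a_lo = split_nibbles(a)
    b_hi, b_lo = split_nibbles(b)
    return ((a_hi * b_hi) << 8) + ((a_hi * b_lo + a_lo * b_hi) << 4) + a_lo * b_lo

# Verify against a direct multiply over the full INT8 range
assert all(int8_mul_via_int4(a, b) == a * b
           for a in range(-128, 128) for b in range(-128, 128))
print(int8_mul_via_int4(-77, 113))  # -8701
```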
A heterogeneous and programmable compute-in-memory accelerator architecture for analog-AI using dense 2-D mesh
We introduce a highly heterogeneous and programmable compute-in-memory (CIM)
accelerator architecture for deep neural network (DNN) inference. This architecture …
Comprehending in-memory computing trends via proper benchmarking
NR Shanbhag, SK Roy - 2022 IEEE Custom Integrated Circuits …, 2022 - ieeexplore.ieee.org
Since its inception in 2014 [1], the modern version of in-memory computing (IMC) has
become an active area of research in integrated circuit design globally for realizing artificial …
Digital versus analog artificial intelligence accelerators: Advances, trends, and emerging designs
For state-of-the-art artificial intelligence (AI) accelerators, there have been large advances in
both all-digital and analog/mixed-signal circuit-based designs. This article presents a …
NN-LUT: Neural approximation of non-linear operations for efficient Transformer inference
Non-linear operations such as GELU, layer normalization, and Softmax are essential yet
costly building blocks of Transformer models. Several prior works simplified these …
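A minimal sketch of the general LUT idea (not NN-LUT's learned parameterization): replace GELU's erf with a small table of piecewise-linear segments, so each evaluation costs one index, one multiply, and one add; the segment count and input range are illustrative choices.

```python
import math
import numpy as np

def gelu(x: float) -> float:
    """Reference GELU using the exact erf form."""
    return 0.5 * x * (1.0 + math.erf(x / math.sqrt(2.0)))

# Piecewise-linear table over [-4, 4] with 16 uniform 0.5-wide segments.
xs = np.linspace(-4.0, 4.0, 17)
ys = np.array([gelu(float(v)) for v in xs])
slopes = np.diff(ys) / np.diff(xs)          # per-segment slope
intercepts = ys[:-1] - slopes * xs[:-1]     # per-segment intercept

def gelu_lut(x: float) -> float:
    """GELU as slope*x + intercept from a small LUT: one index,
    one multiply, one add, instead of evaluating erf."""
    if x <= xs[0]:
        return 0.0          # GELU(x) -> 0 as x -> -inf
    if x >= xs[-1]:
        return x            # GELU(x) -> x as x -> +inf
    i = int((x - xs[0]) / 0.5)   # uniform segments: index by division
    return float(slopes[i] * x + intercepts[i])

errs = [abs(gelu_lut(t) - gelu(t)) for t in np.linspace(-6.0, 6.0, 1201)]
print(f"max abs error over [-6, 6]: {max(errs):.4f}")
```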
Hardware-aware DNN compression via diverse pruning and mixed-precision quantization
Deep Neural Networks (DNNs) have shown significant advantages in a wide variety of
domains. However, DNNs are becoming computationally intensive and energy hungry at an …
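To ground the two compression knobs the abstract names, the sketch below pairs unstructured magnitude pruning with symmetric uniform quantization; the pruning criterion, granularity, and bit-widths are generic assumptions, not the paper's method.

```python
import numpy as np

def magnitude_prune(w: np.ndarray, sparsity: float) -> np.ndarray:
    """Zero out the smallest-magnitude fraction of weights
    (unstructured magnitude pruning, one of several possible criteria)."""
    k = int(sparsity * w.size)
    if k == 0:
        return w.copy()
    thresh = np.partition(np.abs(w).ravel(), k - 1)[k - 1]
    return np.where(np.abs(w) <= thresh, 0.0, w)

def quantize_uniform(w: np.ndarray, bits: int):
    """Symmetric uniform quantization to a given bit-width; mixed
    precision assigns a different `bits` to each layer."""
    qmax = 2 ** (bits - 1) - 1
    scale = np.abs(w).max() / qmax if w.size else 1.0
    q = np.clip(np.round(w / scale), -qmax - 1, qmax).astype(np.int32)
    return q, scale   # dequantize with q * scale

rng = np.random.default_rng(0)
w = rng.normal(size=(64, 64)).astype(np.float32)
w_pruned = magnitude_prune(w, sparsity=0.5)     # 50% of weights zeroed
q, scale = quantize_uniform(w_pruned, bits=4)   # e.g. a 4-bit layer
print((w_pruned == 0).mean(), np.abs(q * scale - w_pruned).max())
```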