Hardware approximate techniques for deep neural network accelerators: A survey

G Armeniakos, G Zervakis, D Soudris… - ACM Computing …, 2022 - dl.acm.org
Deep Neural Networks (DNNs) are very popular because of their high performance in
various cognitive tasks in Machine Learning (ML). Recent advancements in DNNs have …

Survey of deep learning accelerators for edge and emerging computing

S Alam, C Yakopcic, Q Wu, M Barnell, S Khan… - Electronics, 2024 - mdpi.com
The unprecedented progress in artificial intelligence (AI), particularly in deep learning
algorithms with ubiquitous internet connected smart devices, has created a high demand for …

A 5-nm 254-TOPS/W 221-TOPS/mm2 Fully-Digital Computing-in-Memory Macro Supporting Wide-Range Dynamic-Voltage-Frequency Scaling and Simultaneous …

H Fujiwara, H Mori, WC Zhao… - … Solid-State Circuits …, 2022 - ieeexplore.ieee.org
Computing-in-memory (CIM) is being widely explored to minimize power consumption in
data movement and multiply-and-accumulate (MAC) for edge-AI devices. Although most …
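The dot-product step that such CIM macros accelerate can be sketched behaviorally. The model below shows a bit-serial scheme often used in digital CIM, where each cycle gates the stored weights by one activation bit plane and the periphery shift-accumulates the partial sums; it is an illustrative sketch with unsigned activations, not the cited macro's datapath.

```python
def bit_serial_mac(weights, activations, act_bits=8):
    """Behavioral model of a bit-serial CIM dot product.

    Each 'cycle' the array sums the weights gated by one activation
    bit plane; the digital periphery shift-accumulates the partials.
    Activations are assumed unsigned (an illustrative simplification).
    """
    acc = 0
    for b in range(act_bits):
        # Partial sum for bit plane b: weights whose activation has bit b set.
        psum = sum(w for w, a in zip(weights, activations) if (a >> b) & 1)
        acc += psum << b  # shift-and-accumulate in the periphery
    return acc
```

For example, `bit_serial_mac([3, -1, 2], [5, 7, 1])` reproduces the ordinary dot product 3·5 + (−1)·7 + 2·1 = 10 while only ever summing weights per bit plane, which is what lets the array amortize data movement across many MACs.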

Sparsity-aware and re-configurable NPU architecture for Samsung flagship mobile SoC

JW Jang, S Lee, D Kim, H Park… - 2021 ACM/IEEE 48th …, 2021 - ieeexplore.ieee.org
Of late, deep neural networks have become ubiquitous in mobile applications. As mobile
devices generally require immediate response while maintaining user privacy, the demand …

A multi-mode 8K-MAC HW-utilization-aware neural processing unit with a unified multi-precision datapath in 4-nm flagship mobile SoC

JS Park, C Park, S Kwon, T Jeon… - IEEE Journal of Solid …, 2022 - ieeexplore.ieee.org
This article presents an 8k-multiply-accumulate (MAC) neural processing unit (NPU) in 4-nm
mobile system-on-chip (SoC). The unified multi-precision MACs support from integer (INT) …

A heterogeneous and programmable compute-in-memory accelerator architecture for analog-AI using dense 2-D mesh

S Jain, H Tsai, CT Chen, R Muralidhar… - … Transactions on Very …, 2022 - ieeexplore.ieee.org
We introduce a highly heterogeneous and programmable compute-in-memory (CIM)
accelerator architecture for deep neural network (DNN) inference. This architecture …

Comprehending in-memory computing trends via proper benchmarking

NR Shanbhag, SK Roy - 2022 IEEE Custom Integrated Circuits …, 2022 - ieeexplore.ieee.org
Since its inception in 2014 [1], the modern version of in-memory computing (IMC) has
become an active area of research in integrated circuit design globally for realizing artificial …

Digital versus analog artificial intelligence accelerators: Advances, trends, and emerging designs

J Seo, J Saikia, J Meng, W He, H Suh… - IEEE Solid-State …, 2022 - ieeexplore.ieee.org
For state-of-the-art artificial intelligence (AI) accelerators, there have been large advances in
both all-digital and analog/mixed-signal circuit-based designs. This article presents a …

NN-LUT: Neural approximation of non-linear operations for efficient transformer inference

J Yu, J Park, S Park, M Kim, S Lee, DH Lee… - Proceedings of the 59th …, 2022 - dl.acm.org
Non-linear operations such as GELU, layer normalization, and Softmax are essential yet
costly building blocks of Transformer models. Several prior works simplified these …
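The general idea of replacing a costly nonlinearity with a lookup table can be sketched as below for GELU. Note this is a plain uniform-grid LUT with linear interpolation, labeled as an illustration; NN-LUT itself derives its table from a trained neural approximation, and the range [−4, 4] and 256-entry grid here are assumptions, not the paper's parameters.

```python
import math

def gelu(x):
    """Exact GELU via erf, used as the reference."""
    return 0.5 * x * (1.0 + math.erf(x / math.sqrt(2.0)))

# Illustrative uniform grid over [-4, 4]; outside this range GELU is
# effectively 0 (left tail) or x (right tail).
LO, HI, N = -4.0, 4.0, 256
STEP = (HI - LO) / N
TABLE = [gelu(LO + i * STEP) for i in range(N + 1)]

def gelu_lut(x):
    """Piecewise-linear GELU approximation via table lookup."""
    if x <= LO:
        return 0.0   # left tail: GELU(x) ~ 0
    if x >= HI:
        return x     # right tail: GELU(x) ~ x
    i = int((x - LO) / STEP)
    frac = (x - LO) / STEP - i
    return TABLE[i] * (1.0 - frac) + TABLE[i + 1] * frac
```

With 256 entries the interpolation error stays well below 1e-3 across the range, which is the kind of accuracy/cost trade-off that makes table-based approximations attractive in inference hardware, where an erf or exp unit would be far more expensive than a small SRAM.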

Hardware-aware DNN compression via diverse pruning and mixed-precision quantization

K Balaskas, A Karatzas, C Sad… - … on Emerging Topics …, 2024 - ieeexplore.ieee.org
Deep Neural Networks (DNNs) have shown significant advantages in a wide variety of
domains. However, DNNs are becoming computationally intensive and energy hungry at an …
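The two compression primitives the title combines can be sketched minimally as follows. These are the generic building blocks (magnitude pruning and symmetric uniform quantization), not the paper's method: the sparsity level and bit-width are illustrative inputs here, whereas the paper selects such hyperparameters per layer in a hardware-aware way.

```python
def magnitude_prune(weights, sparsity):
    """Zero out roughly the smallest-magnitude `sparsity` fraction of weights.

    Ties at the threshold may prune slightly more than requested;
    acceptable for a sketch.
    """
    k = int(len(weights) * sparsity)
    if k == 0:
        return list(weights)
    threshold = sorted(abs(w) for w in weights)[k - 1]
    return [0.0 if abs(w) <= threshold else w for w in weights]

def quantize_symmetric(weights, bits):
    """Symmetric uniform quantization to signed `bits`-bit integers."""
    qmax = (1 << (bits - 1)) - 1
    # Guard against an all-zero tensor (scale would be 0).
    scale = max(abs(w) for w in weights) / qmax or 1.0
    q = [max(-qmax, min(qmax, round(w / scale))) for w in weights]
    return q, scale  # dequantize each element as q_i * scale
```

A typical pipeline composes the two, e.g. `quantize_symmetric(magnitude_prune(w, 0.5), 8)`: pruning shrinks the work the MAC array must do, while quantization shrinks each remaining operand, and the joint choice of sparsity and precision is what the hardware-aware search optimizes.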