A 28nm 29.2 TFLOPS/W BF16 and 36.5 TOPS/W INT8 reconfigurable digital CIM processor with unified FP/INT pipeline and bitwise in-memory Booth multiplication for …

F Tu, Y Wang, Z Wu, L Liang, Y Ding… - … Solid-State Circuits …, 2022 - ieeexplore.ieee.org
Many computing-in-memory (CIM) processors have been proposed for edge deep learning
(DL) acceleration. They usually rely on analog CIM techniques to achieve high-efficiency NN …
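
The "bitwise in-memory Booth multiplication" in the title refers to radix-4 Booth recoding, which halves the number of partial products relative to naive shift-and-add. A minimal Python sketch of the arithmetic follows, as a generic illustration only, not the paper's in-memory bit-serial circuit; the function names are mine.

    def booth_radix4_digits(y, bits=8):
        # Recode signed y into bits//2 overlapping radix-4 Booth digits
        # in {-2, -1, 0, +1, +2}; y is taken as a two's-complement value.
        y = (y & ((1 << bits) - 1)) << 1          # append the implicit y[-1] = 0
        digits = []
        for j in range(bits // 2):
            w = (y >> (2 * j)) & 0b111            # 3-bit window y[2j+1 .. 2j-1]
            digits.append((w & 1) + ((w >> 1) & 1) - 2 * ((w >> 2) & 1))
        return digits

    def booth_multiply(x, y, bits=8):
        # Only bits//2 partial products instead of bits for shift-and-add,
        # which is what cuts the number of accumulation steps in hardware.
        return sum(d * (x << (2 * j))
                   for j, d in enumerate(booth_radix4_digits(y, bits)))

    assert booth_multiply(-37, 91) == -37 * 91

Each recoded digit selects a partial product from {-2x, -x, 0, +x, +2x}, all obtainable by shift and negation, which is what makes the recoding hardware-friendly.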

An overview of energy-efficient hardware accelerators for on-device deep-neural-network training

J Lee, HJ Yoo - IEEE Open Journal of the Solid-State Circuits …, 2021 - ieeexplore.ieee.org
Deep Neural Networks (DNNs) have been widely used in various artificial intelligence (AI)
applications due to their superior performance. Furthermore, recently, several …

An overview of sparsity exploitation in CNNs for on-device intelligence with software-hardware cross-layer optimizations

S Kang, G Park, S Kim, S Kim, D Han… - IEEE Journal on …, 2021 - ieeexplore.ieee.org
This paper presents a detailed overview of sparsity exploitation in deep neural network
(DNN) accelerators. Despite the algorithmic advancements which drove DNNs to become …
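
As a concrete anchor for what "sparsity exploitation" means at the hardware level, here is a toy zero-skipping sketch in Python; this is illustrative only, and the survey covers far richer schemes (structured pruning, compressed weight formats, and cross-layer co-design).

    import numpy as np

    def sparse_dot(weights, activations):
        # Skip every MAC whose activation operand is zero; with ReLU
        # outputs, roughly half the multiplies and weight fetches vanish.
        nz = np.nonzero(activations)[0]
        return weights[nz] @ activations[nz]

    rng = np.random.default_rng(0)
    acts = np.maximum(rng.normal(size=128), 0)    # ReLU output, ~50% zeros
    wts = rng.normal(size=128)
    assert np.isclose(sparse_dot(wts, acts), wts @ acts)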

ReDCIM: Reconfigurable digital computing-in-memory processor with unified FP/INT pipeline for cloud AI acceleration

F Tu, Y Wang, Z Wu, L Liang, Y Ding… - IEEE Journal of Solid …, 2022 - ieeexplore.ieee.org
Cloud AI acceleration has drawn great attention in recent years, as large models become a
dominant trend in deep learning. Cloud AI runs high-efficiency inference, high …
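
The "unified FP/INT pipeline" idea is that floating-point MACs can be pre-aligned to a shared exponent so that accumulation runs on the same integer adder tree that serves INT workloads. A hedged Python sketch of that structure follows; the bit widths, grouping, and function name are illustrative assumptions, not ReDCIM's exact datapath.

    import numpy as np

    def aligned_fp_dot(x, w, mant_bits=8):
        xm, xe = np.frexp(x); wm, we = np.frexp(w)        # split into mantissa/exponent
        xi = np.round(xm * 2**mant_bits).astype(np.int64) # signed integer mantissas
        wi = np.round(wm * 2**mant_bits).astype(np.int64)
        prod = xi * wi                                    # integer multipliers
        pe = xe + we                                      # product exponents
        shift = pe.max() - pe                             # align to the shared exponent
        acc = int((prod >> shift).sum())                  # pure integer accumulation
        return acc * 2.0**(int(pe.max()) - 2 * mant_bits) # one FP normalization at the end

    rng = np.random.default_rng(1)
    x, w = rng.normal(size=32), rng.normal(size=32)
    print(aligned_fp_dot(x, w), "vs", x @ w)

Because everything after the alignment is integer arithmetic, the same accumulator array can serve both FP and INT modes, which is the reuse the title advertises.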

T-PIM: An energy-efficient processing-in-memory accelerator for end-to-end on-device training

J Heo, J Kim, S Lim, W Han… - IEEE Journal of Solid-State …, 2022 - ieeexplore.ieee.org
Recently, on-device training has become crucial for the success of edge intelligence.
However, frequent data movement between computing units and memory during training …

C-DNN: A 24.5-85.8 TOPS/W complementary-deep-neural-network processor with heterogeneous CNN/SNN core architecture and forward-gradient-based sparsity …

S Kim, S Kim, S Hong, S Kim, D Han… - 2023 IEEE International …, 2023 - ieeexplore.ieee.org
Spiking neural networks (SNNs) have been studied for a long time and have recently been
shown to achieve the same accuracy as convolutional neural networks (CNNs). By using …

Comprehending in-memory computing trends via proper benchmarking

NR Shanbhag, SK Roy - 2022 IEEE Custom Integrated Circuits …, 2022 - ieeexplore.ieee.org
Since its inception in 2014 [1], the modern version of in-memory computing (IMC) has
become an active area of research in integrated circuit design globally for realizing artificial …

7.8 A 22nm delta-sigma computing-in-memory (ΔΣ CIM) SRAM macro with near-zero-mean outputs and LSB-first ADCs achieving 21.38 TOPS/W for 8b-MAC edge AI …

P Chen, M Wu, W Zhao, J Cui, Z Wang… - … Solid-State Circuits …, 2023 - ieeexplore.ieee.org
In AI-edge devices, changes in input features are normally progressive or occasional,
e.g., abnormal surveillance; hence, reprocessing unchanged data consumes a …
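
The energy argument in this snippet, that mostly-static inputs should not be recomputed from scratch, can be made concrete with a frame-to-frame delta update. This is a generic sketch of the reuse idea, not the macro's actual ΔΣ modulation or LSB-first ADC readout.

    import numpy as np

    rng = np.random.default_rng(0)
    W = rng.integers(-8, 8, size=(64, 256))               # INT weights
    x_prev = rng.integers(0, 256, size=256)
    y = W @ x_prev                                        # full product on the first frame

    x_next = x_prev.copy()
    x_next[rng.choice(256, size=5, replace=False)] += 1   # only a few inputs change

    delta = x_next - x_prev
    changed = np.nonzero(delta)[0]                        # sparse for slow-moving scenes
    y += W[:, changed] @ delta[changed]                   # MACs only on changed columns

    assert np.array_equal(y, W @ x_next)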

Benchmarking in-memory computing architectures

NR Shanbhag, SK Roy - IEEE Open Journal of the Solid-State …, 2022 - ieeexplore.ieee.org
In-memory computing (IMC) architectures have emerged as a compelling platform to
implement energy-efficient machine learning (ML) systems. However, today, the energy …

FlexBlock: A flexible DNN training accelerator with multi-mode block floating point support

SH Noh, J Koo, S Lee, J Park… - IEEE Transactions on …, 2023 - ieeexplore.ieee.org
When training deep neural networks (DNNs), expensive floating point arithmetic units are
used in GPUs or custom neural processing units (NPUs). To reduce the burden of floating …
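
Block floating point, the format FlexBlock builds on, stores one shared exponent per block of values so that the MACs themselves run on integer units. A minimal sketch follows; the block size, mantissa width, and function names are illustrative, and the paper's contribution is switching among several such modes during training.

    import numpy as np

    def to_bfp(x, mantissa_bits=8):
        # One shared exponent per block, chosen from the largest magnitude;
        # each element then stores only a small signed integer mantissa.
        shared_exp = int(np.ceil(np.log2(np.abs(x).max() + 1e-30)))
        scale = 2.0 ** (shared_exp - (mantissa_bits - 1))
        lim = 2 ** (mantissa_bits - 1)
        mant = np.clip(np.round(x / scale), -lim, lim - 1).astype(np.int32)
        return mant, scale

    def bfp_dot(xm, xs, wm, ws):
        # Integer multiply-accumulate; one floating-point rescale per block.
        return float(xm @ wm) * (xs * ws)

    rng = np.random.default_rng(2)
    x, w = rng.normal(size=16), rng.normal(size=16)
    xm, xs = to_bfp(x); wm, ws = to_bfp(w)
    print(bfp_dot(xm, xs, wm, ws), "vs", float(x @ w))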