Design principles for lifelong learning AI accelerators

D Kudithipudi, A Daram, AM Zyarah, FT Zohora… - Nature …, 2023 - nature.com
Lifelong learning—an agent's ability to learn throughout its lifetime—is a hallmark of
biological learning systems and a central challenge for artificial intelligence (AI). The …

An overview of energy-efficient hardware accelerators for on-device deep-neural-network training

J Lee, HJ Yoo - IEEE Open Journal of the Solid-State Circuits …, 2021 - ieeexplore.ieee.org
Deep neural networks (DNNs) have been widely used in various artificial intelligence (AI)
applications due to their overwhelming performance. Recently, several …

9.2 A 28nm 12.1 TOPS/W dual-mode CNN processor using effective-weight-based convolution and error-compensation-based prediction

H Mo, W Zhu, W Hu, G Wang, Q Li, A Li… - … Solid-State Circuits …, 2021 - ieeexplore.ieee.org
To deploy convolutional neural networks (CNNs) on edge devices efficiently, most existing
CNN processors are built on quantized CNNs to optimize the inference operations …

CHIMERA: A 0.92-TOPS, 2.2-TOPS/W edge AI accelerator with 2-MByte on-chip foundry resistive RAM for efficient training and inference

K Prabhu, A Gural, ZF Khan… - IEEE Journal of Solid …, 2022 - ieeexplore.ieee.org
Implementing edge artificial intelligence (AI) inference and training is challenging with
current memory technologies. As deep neural networks (DNNs) grow in size, this problem is …

7.4 GANPU: A 135 TFLOPS/W multi-DNN training processor for GANs with speculative dual-sparsity exploitation

S Kang, D Han, J Lee, D Im, S Kim… - … Solid-State Circuits …, 2020 - ieeexplore.ieee.org
Generative adversarial networks (GANs) have a wide range of applications, from image style
transfer to synthetic voice generation [1]. GAN applications on mobile devices, such as face …

9.3 A 40nm 4.81 TFLOPS/W 8b floating-point training processor for non-sparse neural networks using shared exponent bias and 24-way fused multiply-add tree

J Park, S Lee, D Jeon - 2021 IEEE International Solid-State …, 2021 - ieeexplore.ieee.org
Recent works on mobile deep-learning processors have presented designs that exploit
sparsity [2, 3], which is commonly found in various neural networks. However, due to the shift …
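
The shared-exponent-bias idea named in this title (and in the JSSC version listed below) can be illustrated compactly: instead of a fixed IEEE-style exponent bias, each tensor gets a bias chosen from its own magnitude range, so an 8-bit float spends its dynamic range where the data actually lives. The numpy simulation below is a sketch under assumed format parameters (1 sign, 4 exponent, 3 mantissa bits, no inf/NaN codes); the paper's hardware format and rounding details may differ.

```python
import numpy as np

def quantize_fp8_shared_bias(x, exp_bits=4, man_bits=3):
    """Simulate FP8 quantization with a per-tensor shared exponent bias.

    The shared bias aligns the format's top exponent code with the
    exponent of the tensor's largest magnitude (assumed scheme).
    """
    x = np.asarray(x, dtype=np.float64)
    max_mag = np.max(np.abs(x))
    if max_mag == 0.0:
        return x.copy(), 0

    e_top_field = 2 ** exp_bits - 1            # largest exponent code (nothing reserved)
    top_exp = int(np.floor(np.log2(max_mag)))  # unbiased exponent of the max value
    bias = e_top_field - top_exp               # shared, per-tensor exponent bias
    e_max = e_top_field - bias                 # == top_exp, largest usable exponent
    e_min = 1 - bias                           # smallest normal exponent

    sign, mag = np.sign(x), np.abs(x)
    safe = np.where(mag > 0.0, mag, 1.0)       # avoid log2(0)
    e = np.clip(np.floor(np.log2(safe)), e_min, e_max)
    step = 2.0 ** (e - man_bits)               # mantissa quantization step at exponent e
    q = np.round(mag / step) * step            # round-to-nearest on the mantissa
    q = np.minimum(q, (2.0 - 2.0 ** -man_bits) * 2.0 ** e_max)  # clamp to max magnitude
    q[mag == 0.0] = 0.0
    return sign * q, bias
```

In a training chip the bias would presumably be tracked per tensor or per layer, letting the representable range follow activation, weight, and gradient statistics as they shift over training.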

Direct feedback alignment with sparse connections for local learning

B Crafton, A Parihar, E Gebhardt… - Frontiers in …, 2019 - frontiersin.org
Recent advances in deep neural networks (DNNs) owe their success to training algorithms
that use backpropagation and gradient descent. Backpropagation, while highly effective on …
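
Direct feedback alignment (DFA) replaces backpropagation's transposed-weight error path with fixed random feedback matrices, and Crafton et al. additionally sparsify those matrices. A minimal numpy sketch of that idea for one hidden layer follows; the layer sizes, feedback density, and learning rate are illustrative assumptions, not the paper's configuration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Tiny one-hidden-layer MLP; sizes and hyperparameters are illustrative.
n_in, n_hid, n_out, lr = 784, 256, 10, 0.05
W1 = rng.standard_normal((n_in, n_hid)) * 0.01
W2 = rng.standard_normal((n_hid, n_out)) * 0.01

# Fixed sparse random feedback matrix: never updated during training.
density = 0.1
B1 = rng.standard_normal((n_out, n_hid)) * (rng.random((n_out, n_hid)) < density)

def dfa_step(x, y_onehot):
    """One training step with direct feedback alignment (sparse feedback)."""
    global W1, W2
    # forward pass
    h_pre = x @ W1
    h = np.maximum(h_pre, 0.0)                      # ReLU
    logits = h @ W2
    p = np.exp(logits - logits.max(axis=1, keepdims=True))
    p /= p.sum(axis=1, keepdims=True)               # softmax
    e = p - y_onehot                                # output error (cross-entropy grad)

    # DFA: send the output error straight to the hidden layer through the
    # fixed sparse matrix B1 instead of backpropagating through W2.T.
    dh = (e @ B1) * (h_pre > 0.0)

    W2 -= lr * (h.T @ e) / len(x)
    W1 -= lr * (x.T @ dh) / len(x)
```

Because each layer's update depends only on its local activations and the broadcast output error, no backward weight transport is needed, which is what makes the scheme attractive for local, on-chip learning.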

A 65-nm neuromorphic image classification processor with energy-efficient training through direct spike-only feedback

J Park, J Lee, D Jeon - IEEE Journal of Solid-State Circuits, 2019 - ieeexplore.ieee.org
Recent advances in neural network (NN) and machine learning algorithms have sparked a
wide array of research in specialized hardware, ranging from high-performance NN …

GANPU: An energy-efficient multi-DNN training processor for GANs with speculative dual-sparsity exploitation

S Kang, D Han, J Lee, D Im, S Kim… - IEEE Journal of Solid …, 2021 - ieeexplore.ieee.org
This article presents the generative adversarial network processing unit (GANPU), an energy-
efficient multiple deep neural network (DNN) training processor for GANs. It enables on …
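
The "dual sparsity" in GANPU's title refers to exploiting zeros on both the input side (zero activations or errors) and the output side (results a following ReLU would discard), with the output zeros identified speculatively before the full-precision computation. The sketch below is a software analogue of that idea for a single matrix-vector product, using a sign-based predictor as a stand-in for low-precision speculation; GANPU's actual dataflow and misprediction handling are hardware-specific and not reproduced here.

```python
import numpy as np

def dual_sparse_relu_matvec(W, x):
    """Software analogue of dual-sparsity exploitation for relu(W @ x).

    Input sparsity:  multiply only by the nonzero entries of x.
    Output sparsity: skip outputs that a cheap speculation marks as
    values the following ReLU would zero out anyway.
    """
    nz = np.flatnonzero(x)                    # indices of nonzero inputs
    # Stand-in speculation: predict each output's sign from 1-bit (sign)
    # versions of the operands. The chip's predictor and its handling of
    # mispredictions are design points this sketch glosses over.
    spec = np.sign(W[:, nz]) @ np.sign(x[nz])
    keep = spec > 0                           # outputs speculated positive
    y = np.zeros(W.shape[0])
    y[keep] = W[np.ix_(keep, nz)] @ x[nz]     # full-precision work only where needed
    return np.maximum(y, 0.0)                 # skipped outputs stay at zero
```

Training a GAN involves feed-forward, error propagation, and weight-update passes over both generator and discriminator, so zeros appear on different operands in different passes; exploiting both sides at once is what distinguishes this design from inference-only zero skipping.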

A neural network training processor with 8-bit shared exponent bias floating point and multiple-way fused multiply-add trees

J Park, S Lee, D Jeon - IEEE Journal of Solid-State Circuits, 2021 - ieeexplore.ieee.org
Recent advances in deep neural networks (DNNs) and machine learning algorithms have
induced the demand for services based on machine learning algorithms that require a large …