An overview of energy-efficient hardware accelerators for on-device deep-neural-network training

J Lee, HJ Yoo - IEEE Open Journal of the Solid-State Circuits …, 2021 - ieeexplore.ieee.org
Deep Neural Networks (DNNs) have been widely used in various artificial intelligence (AI)
applications due to their overwhelming performance. Furthermore, recently, several …

High-precision method and architecture for base-2 softmax function in DNN training

Y Zhang, L Peng, L Quan, Y Zhang… - … on Circuits and …, 2023 - ieeexplore.ieee.org
Softmax is a common and complex activation function in Deep Neural Networks (DNNs).
However, it is a challenge to apply it efficiently in DNN training hardware accelerator …

REACT: a heterogeneous reconfigurable neural network accelerator with software-configurable NoCs for training and inference on wearables

M Upadhyay, R Juneja, B Wang, J Zhou… - Proceedings of the 59th …, 2022 - dl.acm.org
On-chip training improves model accuracy on personalised user data and preserves privacy.
This work proposes REACT, an AI accelerator for wearables that has heterogeneous cores …

Evaluation of Optimizers for Predicting Epilepsy Seizures

P Manju, BR Devassy, K Vidyamol… - … Technologies for High …, 2023 - ieeexplore.ieee.org
Deep learning is now widely used in many fields, including engineering and medicine. CNN
is the most widely used deep learning model due to its high accuracy in image recognition …

Online Training Refinement Network and Architecture Design for Stereo Matching

YS Wu, SS Wu, T Huang… - 2021 IEEE International …, 2021 - ieeexplore.ieee.org
Sending local data to cloud servers puts user privacy at risk and incurs long update
latency. Meanwhile, the state-of-the-art stereo matching method is still computation …

Hot Chips 2020 Posters

F Elsabbagh, B Tine, A Chawda… - 2020 IEEE Hot Chips …, 2020 - ieeexplore.ieee.org
The emergence of data parallel architectures has enabled new opportunities to address
the power limitations and scalability of multi-core processors, allowing new ways to exploit …

ELearn: Edge Learning Processor with Bidirectional Speculation and Sparsity & Mixed-Precision aware Dataflow Parallelism Reconfiguration

F Tu - hc32.hotchips.org
To achieve higher accuracy on local devices, the need to retrain a DNN model on the
edge is increasing. This paper proposes a sparsity and mixed-precision aware edge learning …