HNPU-V1: An Adaptive DNN Training Processor Utilizing Stochastic Dynamic Fixed-Point and Active Bit-Precision Searching

D Han, HJ Yoo - On-Chip Training NPU-Algorithm, Architecture and …, 2023 - Springer
This chapter presents HNPU, which is an energy-efficient DNN training processor by
adopting algorithm–hardware co-design. The HNPU supports stochastic dynamic fixed-point …

Maximizing communication efficiency for large-scale training via 0/1 Adam

Y Lu, C Li, M Zhang, C De Sa, Y He - arXiv preprint arXiv:2202.06009, 2022 - arxiv.org
1-bit gradient compression and local steps are two representative techniques that enable
drastic communication reduction in distributed SGD. Their benefits, however, remain an …

A mobile DNN training processor with automatic bit precision search and fine-grained sparsity exploitation

D Han, D Im, G Park, Y Kim, S Song, J Lee… - IEEE Micro, 2021 - ieeexplore.ieee.org
In this article, an energy-efficient deep learning processor is proposed for deep neural
network (DNN) training in mobile platforms. Conventional mobile DNN training processors …

[BOOK][B] On-Chip Training NPU-Algorithm, Architecture and SoC Design

D Han, HJ Yoo - 2023 - Springer
Deep learning becomes the mainstream of artificial intelligence applications and its demand
is increasing day by day. Previously, deep learning was only considered for cloud-server …

PWPROP: A Progressive Weighted Adaptive Method for Training Deep Neural Networks

D Wang, T Xu, H Zhang, F Shang, H Liu… - 2022 IEEE 34th …, 2022 - ieeexplore.ieee.org
In recent years, adaptive optimization methods for deep learning have attracted
considerable attention. AMSGRAD indicates that the adaptive methods may be hard to …

An Energy-Efficient Deep Neural Network Training Processor with Bit-Slice-Level Reconfigurability and Sparsity Exploitation

D Han, D Im, G Park, Y Kim, S Song… - … IEEE Symposium in …, 2021 - ieeexplore.ieee.org
This paper presents an energy-efficient deep neural network (DNN) training processor
through the four key features: 1) Layer-wise Adaptive bit-Precision Scaling (LAPS) with 2) In …

Introduction to Semi-discrete Calculus

A Shachar - arXiv preprint arXiv:1012.5751, 2010 - arxiv.org
The Infinitesimal Calculus explores mainly two measurements: the instantaneous rates of
change and the accumulation of quantities. This work shows that scientists, engineers …