SignADAM++: Learning confidences for deep neural networks

HNPU-V1: An Adaptive DNN Training Processor Utilizing Stochastic Dynamic Fixed-Point and Active Bit-Precision Searching

D Han, HJ Yoo - On-Chip Training NPU-Algorithm, Architecture and …, 2023 - Springer

This chapter presents HNPU, which is an energy-efficient DNN training processor by
adopting algorithm–hardware co-design. The HNPU supports stochastic dynamic fixed-point …

Cited by 51 · Related articles · All 5 versions

Maximizing communication efficiency for large-scale training via 0/1 Adam

Y Lu, C Li, M Zhang, C De Sa, Y He - arXiv preprint arXiv:2202.06009, 2022 - arxiv.org

1-bit gradient compression and local steps are two representative techniques that enable
drastic communication reduction in distributed SGD. Their benefits, however, remain an …

Cited by 18 · Related articles · All 3 versions

A mobile DNN training processor with automatic bit precision search and fine-grained sparsity exploitation

D Han, D Im, G Park, Y Kim, S Song, J Lee… - IEEE Micro, 2021 - ieeexplore.ieee.org

In this article, an energy-efficient deep learning processor is proposed for deep neural
network (DNN) training in mobile platforms. Conventional mobile DNN training processors …

Cited by 2 · Related articles · All 5 versions

[BOOK] On-Chip Training NPU-Algorithm, Architecture and SoC Design

D Han, HJ Yoo - 2023 - Springer

Deep learning becomes the mainstream of artificial intelligence applications and its demand
is increasing day by day. Previously, deep learning was only considered for cloud-server …

PWPROP: A Progressive Weighted Adaptive Method for Training Deep Neural Networks

D Wang, T Xu, H Zhang, F Shang, H Liu… - 2022 IEEE 34th …, 2022 - ieeexplore.ieee.org

In recent years, adaptive optimization methods for deep learning have attracted
considerable attention. AMSGRAD indicates that the adaptive methods may be hard to …

An Energy-Efficient Deep Neural Network Training Processor with Bit-Slice-Level Reconfigurability and Sparsity Exploitation

D Han, D Im, G Park, Y Kim, S Song… - … IEEE Symposium in …, 2021 - ieeexplore.ieee.org

This paper presents an energy-efficient deep neural network (DNN) training processor
through the four key features: 1) Layer-wise Adaptive bit-Precision Scaling (LAPS) with 2) In …

Introduction to Semi-discrete Calculus

A Shachar - arXiv preprint arXiv:1012.5751, 2010 - arxiv.org

The Infinitesimal Calculus explores mainly two measurements: the instantaneous rates of
change and the accumulation of quantities. This work shows that scientists, engineers …
