Accuracy Boosters: Epoch-Driven Mixed-Mantissa Block Floating-Point for DNN Training

SB Harma, A Chakraborty, B Falsafi, M Jaggi… - arXiv preprint arXiv …, 2022 - arxiv.org
The unprecedented growth in DNN model complexity, size, and amount of training data has
led to a commensurate increase in demand for computing and a search for minimal …

Training DNNs with hybrid block floating point

M Drumond, T Lin, M Jaggi… - Advances in Neural …, 2018 - proceedings.neurips.cc
The wide adoption of DNNs has given birth to unrelenting computing requirements, forcing
datacenter operators to adopt domain-specific accelerators to train them. These accelerators …

TrainBF: High-Performance DNN Training Engine Using BFloat16 on AI Accelerators

Z Xie, S Raskar, M Emani, V Vishwanath - European Conference on …, 2023 - Springer
Training deep neural networks (DNNs) with half-precision floating-point formats is widely
supported on recent hardware and frameworks. However, current training approaches using …
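
The half-precision format named in the title is bfloat16, which keeps fp32's sign and 8-bit exponent and shortens the mantissa to 7 bits. A minimal NumPy sketch of that conversion, shown only to make the format concrete (round-to-nearest-even; not the TrainBF implementation):

    import numpy as np

    def fp32_to_bf16_bits(x: np.ndarray) -> np.ndarray:
        """Round fp32 to bfloat16: keep sign + 8-bit exponent + 7-bit mantissa."""
        bits = x.astype(np.float32).view(np.uint32)
        # Round to nearest even on the 16 low bits that will be dropped
        # (overflow of NaN/inf bit patterns is ignored in this sketch).
        rounding_bias = ((bits >> 16) & 1) + np.uint32(0x7FFF)
        return ((bits + rounding_bias) >> 16).astype(np.uint16)

    def bf16_bits_to_fp32(b: np.ndarray) -> np.ndarray:
        """Expand the 16 retained bits back to fp32 for reference arithmetic."""
        return (b.astype(np.uint32) << 16).view(np.float32)

    x = np.array([3.14159, -1e-3, 65504.0], dtype=np.float32)
    print(bf16_bits_to_fp32(fp32_to_bf16_bits(x)))  # bf16-rounded values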

FAST: DNN training under variable precision block floating point with stochastic rounding

SQ Zhang, B McDanel, HT Kung - 2022 IEEE International …, 2022 - ieeexplore.ieee.org
Block Floating Point (BFP) can efficiently support quantization for Deep Neural Network
(DNN) training by providing a wide dynamic range via a shared exponent across a group of …
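
To make the shared-exponent idea concrete, here is a minimal NumPy sketch of block floating-point quantization (block size and mantissa width are illustrative parameters, not the configuration used in FAST): each block stores a single exponent derived from its largest magnitude, and every element keeps only a few mantissa bits relative to that exponent.

    import numpy as np

    def bfp_quantize(x: np.ndarray, block_size: int = 16, mant_bits: int = 4):
        """Quantize a 1-D tensor to block floating point: one shared exponent
        per block, `mant_bits` signed mantissa bits per element (sketch only)."""
        x = x.astype(np.float32)
        pad = (-len(x)) % block_size
        blocks = np.pad(x, (0, pad)).reshape(-1, block_size)
        # Shared exponent per block, chosen so all mantissas fit the signed range.
        max_mag = np.max(np.abs(blocks), axis=1, keepdims=True)
        shared_exp = np.floor(np.log2(np.maximum(max_mag, 1e-38))) + 1.0
        scale = 2.0 ** (shared_exp - (mant_bits - 1))
        mantissas = np.clip(np.round(blocks / scale),
                            -(2 ** (mant_bits - 1)), 2 ** (mant_bits - 1) - 1)
        return mantissas, shared_exp, scale

    def bfp_dequantize(mantissas, scale, n):
        return (mantissas * scale).reshape(-1)[:n]

    x = np.random.randn(40).astype(np.float32)
    m, e, s = bfp_quantize(x)
    print(np.max(np.abs(x - bfp_dequantize(m, s, len(x)))))  # worst-case error

Within a block, multiply-accumulate then reduces to fixed-point arithmetic on the mantissas plus exponent bookkeeping, which is the source of the hardware savings these papers target.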

FlexBlock: A flexible DNN training accelerator with multi-mode block floating point support

SH Noh, J Koo, S Lee, J Park… - IEEE Transactions on …, 2023 - ieeexplore.ieee.org
When training deep neural networks (DNNs), expensive floating point arithmetic units are
used in GPUs or custom neural processing units (NPUs). To reduce the burden of floating …

A Stochastic Rounding-Enabled Low-Precision Floating-Point MAC for DNN Training

SB Ali, SI Filip, O Sentieys - 2024 Design, Automation & Test in …, 2024 - ieeexplore.ieee.org
Training Deep Neural Networks (DNNs) can be computationally demanding, particularly
when dealing with large models. Recent work has aimed to mitigate this computational …
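
Stochastic rounding, named in both this title and FAST above, rounds a value up or down with probability proportional to its distance from each neighbouring representable value, so the rounding error is zero in expectation; this keeps many small gradient updates from being silently rounded away. A small illustrative sketch on an integer grid (not the paper's MAC hardware):

    import numpy as np

    def stochastic_round(x: np.ndarray, rng=None) -> np.ndarray:
        """Round to floor(x) or floor(x)+1 with probability equal to the
        fractional part, so E[stochastic_round(x)] == x (unbiased)."""
        rng = np.random.default_rng() if rng is None else rng
        lo = np.floor(x)
        frac = x - lo
        return lo + (rng.random(x.shape) < frac)

    x = np.full(100_000, 0.3)
    print(stochastic_round(x).mean())  # ~0.3, whereas round-to-nearest gives 0.0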

LBFP: Logarithmic block floating point arithmetic for deep neural networks

C Ni, J Lu, J Lin, Z Wang - 2020 IEEE Asia Pacific Conference …, 2020 - ieeexplore.ieee.org
Fixed-point quantization techniques have attracted considerable attention in deep neural
network (DNN) inference acceleration. Nevertheless, they often require time-consuming fine …

A block minifloat representation for training deep neural networks

S Fox, S Rasoulinezhad, J Faraone… - … Conference on Learning …, 2020 - openreview.net
Training Deep Neural Networks (DNN) with high efficiency can be difficult to achieve with
native floating-point representations and commercially available hardware. Specialized …

Throughput-oriented and Accuracy-aware DNN Training with BFloat16 on GPU

Z Xie, S Raskar, M Emani - 2022 IEEE International Parallel …, 2022 - ieeexplore.ieee.org
Deep Neural Networks (DNNs) have transformed the field of artificial intelligence and
achieved extraordinary success in many areas. The training of DNNs is commonly compute …

SEA: Sign-Separated Accumulation Scheme for Resource-Efficient DNN Accelerators

J Gong, H Saadat, H Javaid… - … , Automation & Test …, 2024 - ieeexplore.ieee.org
Deep neural network (DNN) accelerators targeting training need to support the resource-
hungry floating-point (FP) arithmetic. Typically, the additions (accumulations) are performed …
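
The snippet is cut off, but the title points at splitting accumulation by operand sign. Purely as an illustration of that general idea (a guess at the flavour of the scheme, not the SEA datapath described in the paper), the products in a dot product can be summed into separate positive and negative accumulators and reconciled with a single subtraction at the end, so the inner-loop additions never change sign:

    import numpy as np

    def sign_separated_dot(a: np.ndarray, b: np.ndarray) -> float:
        """Illustrative sign-separated accumulation for a dot product: keep
        positive and negative products in separate (magnitude-only) running
        sums and combine them once at the end."""
        products = a.astype(np.float64) * b.astype(np.float64)
        pos_acc = products[products > 0].sum()    # sum of positive products
        neg_acc = -products[products < 0].sum()   # magnitude of negative products
        return pos_acc - neg_acc

    a = np.random.randn(1024).astype(np.float32)
    b = np.random.randn(1024).astype(np.float32)
    print(sign_separated_dot(a, b), float(a.astype(np.float64) @ b.astype(np.float64)))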