With shared microexponents, a little shifting goes a long way

B Darvish Rouhani, R Zhao, V Elango… - Proceedings of the 50th …, 2023 - dl.acm.org
This paper introduces Block Data Representations (BDR), a framework for exploring and
evaluating a wide spectrum of narrow-precision formats for deep learning. It enables …

A novel CNN gap layer for growth prediction of palm tree plantlings

TA Kumar, R Rajmohan, S Adeola Ajagbe, T Gaber… - PLOS ONE, 2023 - journals.plos.org
Monitoring palm tree seedlings and plantlings presents a formidable challenge because of
the microscopic size of these organisms and the absence of distinguishing morphological …

Effective Interplay between Sparsity and Quantization: From Theory to Practice

SB Harma, A Chakraborty, E Kostenok… - arXiv preprint arXiv …, 2024 - arxiv.org
The increasing size of deep neural networks necessitates effective model compression to
improve computational efficiency and reduce their memory footprint. Sparsity and …

FPGA-based Block Minifloat Training Accelerator for a Time Series Prediction Network

W Zhou, H Qi, D Boland, PHW Leong - ACM Transactions on …, 2024 - dl.acm.org
Time series forecasting is the problem of predicting future data samples from historical
information and recent deep neural networks (DNNs) based techniques have achieved …

Exploration of Custom Floating-Point Formats: A Systematic Approach

S Yousefzadeh, Y Yang, A Peter… - 2024 27th Euromicro …, 2024 - ieeexplore.ieee.org
The remarkable advancements in AI algorithms over the past three decades have been
paralleled by an exponential growth in their complexity, with parameter counts soaring from …

A Dataflow Compiler for Efficient LLM Inference using Custom Microscaling Formats

J Cheng, C Zhang, Z Yu, CS Bouganis… - cl.cam.ac.uk
Model quantization represents both parameters (weights) and intermediate values
(activations) in a more compact format, thereby directly reducing both computational and …