Scaling laws for generative mixed-modal language models

A Aghajanyan, L Yu, A Conneau… - International …, 2023 - proceedings.mlr.press
Generative language models define distributions over sequences of tokens that can
represent essentially any combination of data modalities (e.g., any permutation of image …
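
The snippet gives no formulas; as generic background on how such a law is fit to measurements, here is a minimal sketch assuming a simple single-variable power law with an irreducible-loss offset and synthetic data points, not the paper's mixed-modal formulation (all names and values are illustrative).

```python
# Minimal sketch: fitting a power-law-plus-offset scaling law to synthetic
# (model size, loss) measurements. The functional form and the data are
# illustrative assumptions, not taken from the paper.
import numpy as np
from scipy.optimize import curve_fit

def scaling_law(N, E, A, alpha):
    # Irreducible loss E plus a term that decays with model size N.
    return E + A * N ** (-alpha)

N_obs = np.array([1e7, 3e7, 1e8, 3e8, 1e9])
L_obs = scaling_law(N_obs, 2.0, 400.0, 0.30)   # synthetic, noise-free losses

(E, A, alpha), _ = curve_fit(scaling_law, N_obs, L_obs, p0=[2.0, 300.0, 0.25])
print(f"fitted E={E:.2f}, A={A:.1f}, alpha={alpha:.3f}")
```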

Reduced precision floating-point optimization for Deep Neural Network On-Device Learning on microcontrollers

D Nadalini, M Rusci, L Benini, F Conti - Future Generation Computer …, 2023 - Elsevier
Enabling On-Device Learning (ODL) for Ultra-Low-Power Micro-Controller Units
(MCUs) is a key step for post-deployment adaptation and fine-tuning of Deep Neural …

Autosparse: Towards automated sparse training of deep neural networks

A Kundu, NK Mellempudi, DT Vooturi, B Kaul… - arXiv preprint arXiv …, 2023 - arxiv.org
Sparse training is emerging as a promising avenue for reducing the computational cost of
training neural networks. Several recent studies have proposed pruning methods using …
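
AutoSparse's learnable-threshold mechanism is not described in the snippet; the sketch below shows only the generic magnitude-based sparse-training baseline that such pruning methods build on (a fixed binary mask re-applied after every update). The model, sparsity level, and data are illustrative placeholders.

```python
# Minimal sketch of magnitude-based sparse training: build a binary mask per
# weight tensor, then zero out pruned weights after every optimizer step.
# Generic baseline only, not AutoSparse's learnable-threshold method.
import torch
import torch.nn as nn

def build_masks(model: nn.Module, sparsity: float = 0.9) -> dict:
    masks = {}
    for name, p in model.named_parameters():
        if p.dim() > 1:  # prune weight matrices, leave biases dense
            k = int(p.numel() * sparsity)
            threshold = p.abs().flatten().kthvalue(k).values
            masks[name] = (p.abs() > threshold).float()
    return masks

def apply_masks(model: nn.Module, masks: dict) -> None:
    with torch.no_grad():
        for name, p in model.named_parameters():
            if name in masks:
                p.mul_(masks[name])

model = nn.Sequential(nn.Linear(784, 256), nn.ReLU(), nn.Linear(256, 10))
opt = torch.optim.SGD(model.parameters(), lr=0.1)
masks = build_masks(model, sparsity=0.9)

x, y = torch.randn(32, 784), torch.randint(0, 10, (32,))
for _ in range(10):
    loss = nn.functional.cross_entropy(model(x), y)
    opt.zero_grad()
    loss.backward()
    opt.step()
    apply_masks(model, masks)  # re-impose sparsity after each update
```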

Small reals representations for deep learning at the edge: A comparison

M Cococcioni, F Rossi, E Ruffaldi… - Conference on Next …, 2022 - Springer
The pervasiveness of deep neural networks (DNNs) in edge devices enforces new
requirements on information representation. Low precision formats from 16 bits down to 1 or …
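
As a small companion to the comparison the title promises, the sketch below measures the round-trip rounding error of float32, IEEE float16, and bfloat16 in PyTorch; it is illustrative only and omits the sub-16-bit formats the paper also covers.

```python
# Minimal sketch: relative rounding error when the same values are stored in
# 32-bit, 16-bit (IEEE half), and bfloat16 formats.
import torch

x = torch.rand(10_000, dtype=torch.float64) * 100.0 + 1.0  # values in [1, 101)

for dtype in (torch.float32, torch.float16, torch.bfloat16):
    y = x.to(dtype).to(torch.float64)            # round-trip through the format
    rel_err = ((y - x).abs() / x.abs()).max().item()
    fi = torch.finfo(dtype)
    print(f"{str(dtype):16s} bits={fi.bits:2d} eps={fi.eps:.2e} "
          f"max relative error={rel_err:.2e}")
```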

Language Adaptation on a Tight Academic Compute Budget: Tokenizer Swapping Works and Pure bfloat16 Is Enough

K Dobler, G de Melo - arXiv preprint arXiv:2408.15793, 2024 - arxiv.org
We investigate continued pretraining of LLMs for language adaptation on a tight academic
budget: a setting in which only a few GPUs can be used in parallel, for a heavily constrained …
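
A minimal sketch of the "pure bfloat16" setting the title refers to, assuming PyTorch: parameters, activations, and optimizer state all kept in bfloat16, with no fp32 master weights and no autocast. The tiny model, optimizer settings, and random batch are placeholders, and tokenizer swapping is not shown.

```python
# Minimal sketch of "pure bfloat16" training: weights, activations, and
# optimizer state all live in bfloat16; no fp32 master copy, no autocast.
import torch
import torch.nn as nn

model = nn.Sequential(nn.Linear(512, 512), nn.GELU(), nn.Linear(512, 512))
model = model.to(torch.bfloat16)                      # weights stored in bf16
opt = torch.optim.AdamW(model.parameters(), lr=1e-4)  # state will match bf16 params

x = torch.randn(8, 512, dtype=torch.bfloat16)
target = torch.randn(8, 512, dtype=torch.bfloat16)

for step in range(5):
    loss = nn.functional.mse_loss(model(x), target)
    opt.zero_grad()
    loss.backward()
    opt.step()
    print(step, loss.item())
```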

Adaptive loss scaling for mixed precision training

R Zhao, B Vogel, T Ahmed - arXiv preprint arXiv:1910.12385, 2019 - arxiv.org
Mixed precision training (MPT) is becoming a practical technique to improve the speed and
energy efficiency of training deep neural networks by leveraging the fast hardware support …
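
The paper's contribution is layer-wise adaptive loss scaling; the sketch below shows only the conventional dynamic loss-scaling loop that mixed precision training commonly uses (scale the loss before backward, check the gradients for overflow, unscale, then grow or back off the scale). The elementwise toy model is a placeholder.

```python
# Minimal sketch of conventional dynamic loss scaling for fp16 training.
# Not the paper's layer-wise adaptive method; the tiny model is a placeholder.
import torch

torch.manual_seed(0)
x = torch.randn(256, dtype=torch.float16)
y = 3.0 * x                                          # target weights are 3.0
w = torch.zeros(256, dtype=torch.float16, requires_grad=True)
opt = torch.optim.SGD([w], lr=0.02)

scale, backoff, growth, growth_interval = 2.0**15, 0.5, 2.0, 20
good_steps = 0
for step in range(200):
    opt.zero_grad()
    loss = ((w * x - y) ** 2).sum()
    (loss * scale).backward()                        # scaled backward pass
    if not torch.isfinite(w.grad).all():
        scale *= backoff                             # overflow: back off, skip step
        good_steps = 0
        continue
    w.grad.div_(scale)                               # unscale before the update
    opt.step()
    good_steps += 1
    if good_steps % growth_interval == 0:
        scale *= growth                              # grow after a clean streak

print(f"final loss {loss.item():.3f}, final scale {scale:.0f}")
```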

The hidden power of pure 16-bit floating-point neural networks

J Yun, B Kang, Z Fu - arXiv preprint arXiv:2301.12809, 2023 - arxiv.org
Lowering the precision of neural networks from the prevalent 32-bit precision has long been
considered harmful to performance, despite the gain in space and time. Many works …
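
A small numeric illustration of the classic concern behind that view: naive accumulation in 16-bit floats stalls once the running sum dwarfs each addend. It is illustrative only and does not reproduce the paper's analysis of when pure 16-bit training is safe.

```python
# Naive fp16 accumulation loses small contributions once the running sum grows;
# the same data summed with a 32-bit accumulator stays close to the exact value.
import numpy as np

x = np.full(100_000, 0.0001, dtype=np.float16)

naive_fp16 = np.float16(0.0)
for v in x:                               # sequential fp16 accumulation
    naive_fp16 = np.float16(naive_fp16 + v)

fp32_accum = x.astype(np.float32).sum()   # same data, 32-bit accumulator

print("exact      :", 0.0001 * len(x))    # about 10.0
print("fp16 naive :", float(naive_fp16))  # stalls far below 10
print("fp32 accum :", float(fp32_accum))
```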

[BOOK][B] Number Systems for Deep Neural Network Architectures

In this introductory chapter, we provide an overview of the main topics covered in this book
and the motivations to write it. The importance of efficient number systems for Deep Neural …

MultiPosits: Universal Coding of ℝⁿ

P Lindstrom - Conference on Next Generation Arithmetic, 2022 - Springer
Recently proposed real-number representations like Posits and Elias codes provide
attractive alternatives to IEEE floating point for representing real numbers in science and …
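
As background on the posit format the snippet mentions, here is a minimal decoder for standard posits (sign, regime, exponent, fraction fields; es = 2 as in the 2022 posit standard). It is an illustrative sketch, not the MultiPosits coding scheme itself.

```python
# Minimal sketch: decode an n-bit standard posit (es exponent bits) into a
# Python float. Layout: sign | regime run | terminator | exponent | fraction.
def decode_posit(bits: int, n: int = 8, es: int = 2) -> float:
    if bits == 0:
        return 0.0
    if bits == 1 << (n - 1):
        return float("nan")                  # NaR (not a real)

    sign = bits >> (n - 1)
    if sign:
        bits = (-bits) & ((1 << n) - 1)      # two's-complement magnitude

    body = format(bits, f"0{n}b")[1:]        # bits after the sign
    run_bit = body[0]
    run_len = len(body) - len(body.lstrip(run_bit))
    k = run_len - 1 if run_bit == "1" else -run_len

    rest = body[run_len + 1:]                # skip the run and its terminator
    exp_bits = rest[:es].ljust(es, "0")      # truncated exponent bits are zero
    e = int(exp_bits, 2)
    frac_bits = rest[es:]
    f = int(frac_bits, 2) / (1 << len(frac_bits)) if frac_bits else 0.0

    useed = 2 ** (2 ** es)                   # 16 when es = 2
    value = (useed ** k) * (2 ** e) * (1.0 + f)
    return -value if sign else value

# A few posit<8,2> examples: 1.0, 16.0, 0.0625, -1.0.
for p in (0b01000000, 0b01100000, 0b00100000, 0b11000000):
    print(f"{p:08b} -> {decode_posit(p)}")
```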

On the challenges in programming mixed-precision deep neural networks

R Zhao, W Luk, C Xiong, X Niu, KH Tsoi - Proceedings of the 4th ACM …, 2020 - dl.acm.org
Deep Neural Networks (DNNs) are resilient to reduced data precision, which motivates
exploiting low-precision data formats for more efficient computation, especially on custom …
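
A minimal sketch of the kind of low-precision format such work exploits: symmetric per-tensor 8-bit weight quantization in NumPy. The scaling scheme and names are common illustrative choices, not the paper's method.

```python
# Quantize a weight tensor to int8 with a single symmetric scale, then
# dequantize and measure the rounding error and the storage saving.
import numpy as np

def quantize_int8(w: np.ndarray):
    scale = np.abs(w).max() / 127.0                  # symmetric per-tensor scale
    q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q: np.ndarray, scale: float) -> np.ndarray:
    return q.astype(np.float32) * scale

w = np.random.randn(256, 256).astype(np.float32)
q, scale = quantize_int8(w)
w_hat = dequantize(q, scale)
print("max abs quantization error:", np.abs(w - w_hat).max())
print("storage: fp32", w.nbytes, "bytes -> int8", q.nbytes, "bytes")
```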