A study of BFLOAT16 for deep learning training

Resource-efficient convolutional networks: A survey on model-, arithmetic-, and implementation-level techniques

JK Lee, L Mukhanov, AS Molahosseini… - ACM Computing …, 2023 - dl.acm.org

Convolutional neural networks (CNNs) are used in our daily life, including self-driving cars,
virtual assistants, social network services, healthcare services, and face recognition, among …

Cited by 23 · Related articles · All 8 versions

TAB: Unified and optimized ternary, binary, and mixed-precision neural network inference on the edge

S Zhu, LHK Duong, W Liu - ACM Transactions on Embedded Computing …, 2022 - dl.acm.org

Ternary Neural Networks (TNNs) and mixed-precision Ternary Binary Networks (TBNs) have
demonstrated higher accuracy compared to Binary Neural Networks (BNNs) while providing …

Cited by 9 · Related articles · All 4 versions

Highly Efficient Self-Checking Matrix Multiplication on Tiled AMX Accelerators

CS Mummidi, VC Ferreira, S Srinivasan… - ACM Transactions on …, 2024 - dl.acm.org

General Matrix Multiplication (GEMM) is a computationally expensive operation that is used
in many applications such as machine learning. Hardware accelerators are increasingly …

Cited by 2 · Related articles

Resource-demand estimation for edge tensor processing units

B Herzog, S Reif, J Hemp, T Hönig… - ACM Transactions on …, 2022 - dl.acm.org

Machine learning has shown tremendous success in a large variety of applications. The
evolution of machine-learning applications from cloud-based systems to mobile and …

Cited by 7 · Related articles · All 5 versions

Efficient Representation of Large-Alphabet Probability Distributions via Arcsinh-Compander

A Adler, J Tang, Y Polyanskiy - 2022 IEEE International …, 2022 - ieeexplore.ieee.org

A number of engineering and scientific problems require representing and manipulating
probability distributions over large alphabets, which we may think of as long vectors of reals …

Accelerating neural network training using arbitrary precision approximating matrix multiplication algorithms

G Ballard, J Weissenberger, L Zhang - 50th International Conference on …, 2021 - dl.acm.org

Matrix multiplication is one of the bottleneck computations for training the weights within
deep neural networks. To speed up the training phase, we propose to use faster algorithms …

Cited by 1 · Related articles · All 3 versions

[PDF] Applications in Energy and Combustion Science

A Haridas, NR Vadlamani, Y Minamoto - researchgate.net

Information loss in numerical physics simulations can arise from various sources when
solving discretised partial differential equations. In particular, errors related to numerical …
