Although deep learning has made great progress in recent years, the exploding economic and environmental costs of training neural networks are becoming unsustainable. To …
Machine learning (ML) models are widely used in many important domains. For efficient processing of these computation- and memory-intensive applications, tensors of these …
H Xi, C Li, J Chen, J Zhu - Advances in Neural Information …, 2023 - proceedings.neurips.cc
Quantizing the activation, weight, and gradient to 4-bit is promising to accelerate neural network training. However, existing 4-bit training methods require custom numerical formats …
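The 4-bit training work above relies on custom numerical formats; as a rough illustration of the underlying operation only, here is a minimal sketch of plain symmetric signed INT4 quantization and dequantization in NumPy. The function names and the per-tensor scaling scheme are illustrative assumptions, not the paper's actual formats:

```python
import numpy as np

def quantize_int4(x: np.ndarray):
    """Symmetric per-tensor quantization to signed 4-bit range [-8, 7]."""
    m = float(np.max(np.abs(x)))
    scale = m / 7.0 if m > 0 else 1.0
    # Values are stored in an int8 container; only 4 bits are used.
    q = np.clip(np.round(x / scale), -8, 7).astype(np.int8)
    return q, scale

def dequantize_int4(q: np.ndarray, scale: float) -> np.ndarray:
    """Map 4-bit integers back to approximate float values."""
    return q.astype(np.float32) * scale

x = np.array([0.12, -0.5, 0.33, 0.9], dtype=np.float32)
q, s = quantize_int4(x)
x_hat = dequantize_int4(q, s)
```

The round-trip error of each element is bounded by half the quantization step, which is why gradients quantized this way can still drive training, albeit with added noise.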
Training graph neural networks (GNNs) is extremely time-consuming because sparse graph-based operations are hard to accelerate on commodity hardware. Prior art successfully …
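The sparse operations this snippet refers to are, at their core, sparse-dense matrix products: neighbor aggregation multiplies a sparse adjacency matrix by a dense feature matrix. A minimal sketch of one aggregation step with `scipy.sparse`; the toy graph and identity features are invented for illustration:

```python
import numpy as np
from scipy.sparse import csr_matrix

# Toy directed graph on 3 nodes: edges 0->1, 1->2, 2->0, plus self-loops.
rows = np.array([0, 1, 2, 0, 1, 2])
cols = np.array([1, 2, 0, 0, 1, 2])
vals = np.ones(6, dtype=np.float32)
A = csr_matrix((vals, (rows, cols)), shape=(3, 3))

X = np.eye(3, dtype=np.float32)  # node feature matrix (identity for clarity)
H = A @ X                        # one aggregation step: sum of neighbor features
```

The irregular memory access pattern of the sparse product `A @ X` is exactly what makes this step hard to accelerate on commodity hardware.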
As model sizes grow rapidly, fine-tuning large pre-trained language models has become increasingly difficult due to their extensive memory usage. Previous works usually …
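One widely used family of memory-efficient fine-tuning methods (named here for illustration, not necessarily the approach this snippet describes) is low-rank adaptation: the pre-trained weight is frozen and only two small factors are trained. A minimal NumPy sketch with invented dimensions:

```python
import numpy as np

rng = np.random.default_rng(0)
d, r = 8, 2  # hidden size and (much smaller) adapter rank

W = rng.standard_normal((d, d)).astype(np.float32)          # frozen pre-trained weight
A = (rng.standard_normal((d, r)) * 0.01).astype(np.float32) # trainable down-projection
B = np.zeros((r, d), dtype=np.float32)                      # trainable up-projection, zero-init

x = rng.standard_normal((1, d)).astype(np.float32)
y = x @ W + x @ A @ B  # adapted forward pass; only A and B receive gradients
```

Because `B` starts at zero, the adapted model initially matches the frozen one, and optimizer state is only needed for the `2*d*r` adapter parameters instead of all `d*d` weights.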
Silicon-photonics-based optical neural networks (ONNs) are a promising hardware platform that could represent a paradigm shift in efficient AI with their CMOS compatibility, flexibility, ultra …
U Haider, M Hanif, A Rashid, SF Hussain - Image and Vision Computing, 2023 - Elsevier
Convolutional networks (ConvNets) are computationally expensive but well known for their performance on image data. One way to reduce their complexity is to explore inherited data …
The successes of deep learning, variational inference, and many other fields have been aided by specialized implementations of reverse-mode automatic differentiation (AD) to …
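Reverse-mode AD, as referenced above, can be sketched in a few dozen lines: record each operation's local partial derivatives during the forward pass, then sweep the computation graph in reverse topological order accumulating gradients. A minimal Python sketch of the general technique (not any particular library's implementation):

```python
class Var:
    """Minimal reverse-mode automatic differentiation node."""
    def __init__(self, value, parents=()):
        self.value = value
        self.grad = 0.0
        self.parents = parents  # list of (parent Var, local partial derivative)

    def __add__(self, other):
        return Var(self.value + other.value, [(self, 1.0), (other, 1.0)])

    def __mul__(self, other):
        return Var(self.value * other.value,
                   [(self, other.value), (other, self.value)])

    def backward(self):
        # Build a topological order of the graph rooted at self.
        order, seen = [], set()
        def visit(v):
            if id(v) not in seen:
                seen.add(id(v))
                for p, _ in v.parents:
                    visit(p)
                order.append(v)
        visit(self)
        # Reverse sweep: chain rule accumulates into each parent's grad.
        self.grad = 1.0
        for v in reversed(order):
            for p, local in v.parents:
                p.grad += v.grad * local

x, y = Var(3.0), Var(4.0)
z = x * y + x  # z = xy + x, so dz/dx = y + 1 and dz/dy = x
z.backward()
```

The key property of reverse mode is that one backward sweep yields the gradient with respect to every input, which is why it dominates in deep learning where outputs are scalar losses.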
L Ning, H Guan, X Shen - 2019 IEEE 35th International …, 2019 - ieeexplore.ieee.org
This work proposes adaptive deep reuse, a method for accelerating CNN training by identifying and avoiding the unnecessary computations contained in each specific training …
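The core of computation reuse as described in this snippet is grouping similar input vectors and computing a shared result once per group. A crude sketch of that idea, using each row's sign pattern as a grouping key; the paper's adaptive scheme chooses granularity dynamically and is more sophisticated than this fixed hash:

```python
import numpy as np

def reuse_matmul(X: np.ndarray, W: np.ndarray) -> np.ndarray:
    """Group rows of X by sign pattern (a crude similarity hash) and
    compute one product per group, reusing it for all group members."""
    keys = [tuple((row > 0).astype(int)) for row in X]
    groups = {}
    for i, k in enumerate(keys):
        groups.setdefault(k, []).append(i)

    Y = np.empty((X.shape[0], W.shape[1]), dtype=W.dtype)
    for idxs in groups.values():
        centroid = X[idxs].mean(axis=0)
        Y[idxs] = centroid @ W  # one product shared by every row in the group
    return Y

# Rows 0 and 1 are identical, so one product serves both of them.
X = np.array([[1.0, 2.0], [1.0, 2.0], [-1.0, 3.0]], dtype=np.float32)
W = np.array([[1.0, -1.0], [2.0, 0.5]], dtype=np.float32)
Y = reuse_matmul(X, W)
```

When grouped rows are identical, the result is exact; when they are merely similar, the centroid product introduces a small approximation error that such methods trade for fewer multiplications.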