S Mittal, S Vaishay - Journal of Systems Architecture, 2019 - Elsevier
The rise of deep learning (DL) has been fuelled by improvements in accelerators. Due to its unique features, the GPU remains the most widely used accelerator for DL …
R Gu, Y Chen, S Liu, H Dai, G Chen… - … on Parallel and …, 2021 - ieeexplore.ieee.org
Deep learning (DL) is becoming increasingly popular in many domains, including computer vision, speech recognition, self-driving automobiles, etc. GPUs can train DL models efficiently …
Complex networks (CNs) have gained much attention in recent years due to their importance and popularity. The rapid growth in the size of CNs leads to greater difficulty in …
X Yi, S Zhang, Z Luo, G Long, L Diao, C Wu… - Proceedings of the 16th …, 2020 - dl.acm.org
This paper proposes HeteroG, an automatic module to accelerate deep neural network training in heterogeneous GPU clusters. To train a deep learning model with large amounts …
This paper proposes FastT, a transparent module to work with the TensorFlow framework for automatically identifying a satisfying deployment and execution order of operations in DNN …
Deep learning (DL) has become a key tool for solving complex scientific problems. However, managing the multi-dimensional large-scale data associated with DL, especially atop extant …
We present GARFIELD, a library to transparently make machine learning (ML) applications, initially built with popular (but fragile) frameworks, e.g., TensorFlow and PyTorch, Byzantine …
L Liu, Q Jin, D Wang, H Yu, G Sun, S Luo - Future Generation Computer …, 2020 - Elsevier
The bottleneck of Distributed Machine Learning (DML) has shifted from computation to communication. Many works have focused on speeding up the communication phase from …
Networking has become a well-known performance bottleneck for distributed machine learning (DML). Although many works have focused on accelerating the communication …