T Huang, T Luo, JT Zhou - 2020 IEEE 40th International …, 2020 - ieeexplore.ieee.org
Learning in situ is a growing trend for Edge AI. Training deep neural networks (DNNs) on edge devices is challenging because both energy and memory are constrained. Low precision …
P Hao, Y Zhang - 2021 IEEE/ACM Symposium on Edge …, 2021 - ieeexplore.ieee.org
This paper investigates the problem of performing distributed deep learning (DDL) to train machine learning (ML) models at the edge with resource-constrained embedded devices …
M Li, W Xiao, B Sun, H Zhao, H Yang, S Ren… - arXiv preprint arXiv …, 2022 - arxiv.org
Distributed synchronized GPU training is commonly used for deep learning. The constraint of using a fixed set of GPUs makes large-scale deep learning training jobs suffer, and …
L Hu, J Zhu, Z Zhou, R Cheng, X Bai… - arXiv preprint arXiv …, 2021 - arxiv.org
Cloud training platforms, such as Amazon Web Services and Huawei Cloud, provide users with computational resources to train their deep learning jobs. Elastic training is a service …
Deep Neural Networks (DNNs) have advanced the state-of-the-art in a variety of machine learning tasks and are deployed in increasing numbers of products and services. However …
CC Yang, G Cong - 2019 IEEE 26th International Conference …, 2019 - ieeexplore.ieee.org
Data loading can dominate deep neural network training time on large-scale systems. We present a comprehensive study on accelerating data loading performance in large-scale …
Deep Learning (DL) training platforms are built by interconnecting multiple DL accelerators (e.g., GPU/TPU) via fast, customized interconnects with 100s of gigabytes (GBs) of bandwidth …
L Ibraimi, M Selimi, F Freitag - 2021 IEEE Global …, 2021 - ieeexplore.ieee.org
Inference with trained machine learning models is now possible with small computing devices, whereas only a few years ago it ran mostly in the cloud. The recent technique …
S Li, RJ Walls, L Xu, T Guo - 2019 IEEE International …, 2019 - ieeexplore.ieee.org
Distributed training frameworks, like TensorFlow, have been proposed as a means to reduce the training time of deep learning models by using a cluster of GPU servers. While such …