Fathom: Reference workloads for modern deep learning methods

R Adolf, S Rama, B Reagen, GY Wei… - … on Workload …, 2016 - ieeexplore.ieee.org
… models, comparing the performance differences between training and inference, and …
We demonstrate the need for a broader variety of deep learning workloads by looking at the …

Characterizing deep learning training workloads on Alibaba-PAI

M Wang, C Meng, G Long, C Wu… - … on workload …, 2019 - ieeexplore.ieee.org
… of these workloads, and more importantly, the training performance … , we characterize deep
learning training workloads from … of various workloads using different training architectures, to …

An efficient deep learning model to predict cloud workload for industry informatics

Q Zhang, LT Yang, Z Yan, Z Chen… - IEEE transactions on …, 2018 - ieeexplore.ieee.org
… learning model often includes a great number of parameters. In this paper, an efficient
deep learning … is proposed to predict the cloud workload for industry informatics. In the proposed …

Multi-tenant GPU clusters for deep learning workloads: Analysis and implications

M Jeon, S Venkataraman, J Qian… - Technical report …, 2018 - microsoft.com
… In this paper we present a detailed workload characterization … cluster utilization for DNN
training workloads on multi-tenant … , and (3) failures during training. Based on our experience …

Taming unbalanced training workloads in deep learning with partial collective operations

S Li, T Ben-Nun, SD Girolamo, D Alistarh… - Proceedings of the 25th …, 2020 - dl.acm.org
Load imbalance pervasively exists in distributed deep learning training systems, either
caused by the inherent imbalance in learned tasks or by the system itself. Traditional …

AutoSched: An Adaptive Self-configured Framework for Scheduling Deep Learning Training Workloads

W Gao, X Zhang, S Huang, S Guo, P Sun… - Proceedings of the 38th …, 2024 - dl.acm.org
… -arrival workloads. In detail, we analyze the historical workloads in the workload repository
… , and then adopt FFT to extract the periodic workload submission. To generate future-arrival …

Accelerating deep learning workloads through efficient multi-model execution

D Narayanan, K Santhanam… - … Machine Learning, 2018 - people.eecs.berkeley.edu
… However, many multi-model workloads are unable to … optimize multi-model deep learning
workloads. HiveMind optimizes a “… tuning and multi-model inference workloads by up to 10× on …

DLUX: A LUT-based near-bank accelerator for data center deep learning training workloads

P Gu, X Xie, S Li, D Niu, H Zheng… - IEEE Transactions on …, 2020 - ieeexplore.ieee.org
… Second, many of these DNN training workloads are … of these workloads, we conduct a case
study on a Tesla V100 GPU [4] using six representative DNN training data center workloads […

Characterization and prediction of deep learning workloads in large-scale GPU datacenters

Q Hu, P Sun, S Yan, Y Wen, T Zhang - Proceedings of the International …, 2021 - dl.acm.org
… increase the diversity of job workloads to enable more general … to benefit the community
of deep learning systems. Based on the … Then we describe the DL workloads running in this …

Heterogeneity-Aware cluster scheduling policies for deep learning workloads

D Narayanan, K Santhanam, F Kazhamiaka… - … USENIX Symposium on …, 2020 - usenix.org
… on allocating heterogeneous resources for DNN training workloads, we believe that Gavel
can be used for non-DNN workloads as well. Other workloads that are amenable to GPU …