Resource Allocation and Workload Scheduling for Large-Scale Distributed Deep Learning: A Survey

F Liang, Z Zhang, H Lu, C Li, V Leung, Y Guo… - arXiv preprint arXiv …, 2024 - arxiv.org
With rapidly increasing distributed deep learning workloads in large-scale data centers,
efficient distributed deep learning framework strategies for resource allocation and workload …