Eagle: Expedited device placement with automatic grouping for large models

H Lan, L Chen, B Li - 2021 IEEE International Parallel and …, 2021 - ieeexplore.ieee.org
Advanced deep neural networks of large size are usually trained on a mixture of devices,
including multiple CPUs and GPUs. The model training speed and efficiency are drastically …

Mercury: Fast and Optimal Device Placement for Large Deep Learning Models

H Xu, P Zhou, H Xie, Y Liao - … of the 52nd International Conference on …, 2023 - dl.acm.org
The rapidly expanding neural network models are becoming increasingly challenging to run
on a single device. Hence, model parallelism over multiple devices is critical to …

Celeritas: Fast Optimizer for Large Dataflow Graphs

H Xu, Y Liao, H Xie, P Zhou - arXiv preprint arXiv:2208.00184, 2022 - arxiv.org
The rapidly growing neural network models are becoming increasingly challenging to run
on a single device. Hence, model parallelism over multiple devices is critical to guarantee …

Accelerate Model Parallel Training by Using Efficient Graph Traversal Order in Device Placement

T Wang, AH Payberah, DH Hagos… - arXiv preprint arXiv …, 2022 - arxiv.org
Modern neural networks require long training times to reach decent performance on massive
datasets. One common approach to speeding up training is model parallelization, where large …

Device Placement Optimization with Deep Reinforcement Learning

H Lan - 2023 - search.proquest.com
With the proliferation of machine learning, deep neural networks (DNNs) have become
ubiquitous in various real-world applications, and their sizes have become increasingly …

Accelerate Model Parallel Deep Learning Training Using Effective Graph Traversal Order in Device Placement

T Wang, AH Payberah, DH Hagos… - … Conference on Distributed …, 2022 - Springer
Modern neural networks require long training times to reach decent performance on massive
datasets. One common approach to speeding up training is model parallelization, where large …