Eagle: Expedited device placement with automatic grouping for large models

H Lan, L Chen, B Li - 2021 IEEE International Parallel and …, 2021 - ieeexplore.ieee.org
Advanced deep neural networks of large size are usually trained on a mixture of devices,
including multiple CPUs and GPUs. The model training speed and efficiency are drastically …

Mercury: Fast and Optimal Device Placement for Large Deep Learning Models

H Xu, P Zhou, H Xie, Y Liao - … of the 52nd International Conference on …, 2023 - dl.acm.org
The rapidly expanding neural network models are becoming increasingly challenging to run
on a single device. Hence, model parallelism over multiple devices is critical to …

Celeritas: Fast Optimizer for Large Dataflow Graphs

H Xu, Y Liao, H Xie, P Zhou - arXiv preprint arXiv:2208.00184, 2022 - arxiv.org
The rapidly growing neural network models are becoming increasingly challenging to run
on a single device. Hence, model parallelism over multiple devices is critical to guarantee …

Accelerate Model Parallel Training by Using Efficient Graph Traversal Order in Device Placement

T Wang, AH Payberah, DH Hagos… - arXiv preprint arXiv …, 2022 - arxiv.org
Modern neural networks require long training times to reach decent performance on massive
datasets. One common approach to speeding up training is model parallelization, where large …

Device Placement Optimization with Deep Reinforcement Learning

H Lan - 2023 - search.proquest.com
With the proliferation of machine learning, deep neural networks (DNNs) have become
ubiquitous in various real-world applications, and their sizes have become increasingly …

Accelerate Model Parallel Deep Learning Training Using Effective Graph Traversal Order in Device Placement

T Wang, AH Payberah, DH Hagos… - … Conference on Distributed …, 2022 - Springer
Modern neural networks require long training times to reach decent performance on massive
datasets. One common approach to speeding up training is model parallelization, where large …