EFLOPS: Algorithm and system co-design for a high performance distributed training platform

J Dong, Z Cao, T Zhang, J Ye, S Wang… - … Symposium on High …, 2020 - ieeexplore.ieee.org
Deep neural networks (DNNs) have gained tremendous attention as compelling solutions
for applications such as image classification, object detection, speech recognition, and so …

An in-depth analysis of distributed training of deep neural networks

Y Ko, K Choi, J Seo, SW Kim - 2021 IEEE International Parallel …, 2021 - ieeexplore.ieee.org
As the popularity of deep learning in industry rapidly grows, efficient training of deep neural
networks (DNNs) becomes important. To train a DNN with a large amount of data, distributed …
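
Analyses like this one typically contrast parameter-server and all-reduce communication patterns. As a generic illustration (not code from the paper), the pure-Python sketch below simulates ring all-reduce, the pattern most of these systems rely on, as a reduce-scatter phase followed by an all-gather phase over in-memory lists; the function name ring_all_reduce and the toy worker data are made up for the example.

# Minimal single-process simulation of ring all-reduce (reduce-scatter + all-gather).
# Illustrative sketch only; real implementations exchange chunks over the network.

def ring_all_reduce(grads):
    """grads: list of per-worker gradient lists, all of equal length."""
    n = len(grads)                      # number of workers in the ring
    size = len(grads[0])
    chunk = (size + n - 1) // n         # elements per chunk
    # Split every worker's gradient into n chunks.
    chunks = [[g[i * chunk:(i + 1) * chunk] for i in range(n)] for g in grads]

    # Phase 1: reduce-scatter. After n-1 steps, worker w holds the fully
    # reduced chunk (w + 1) % n.
    for step in range(n - 1):
        for w in range(n):
            dst = (w + 1) % n
            c = (w - step) % n          # chunk index worker w sends this step
            chunks[dst][c] = [a + b for a, b in zip(chunks[dst][c], chunks[w][c])]

    # Phase 2: all-gather. Circulate the reduced chunks so every worker ends
    # up with the complete reduced gradient.
    for step in range(n - 1):
        for w in range(n):
            dst = (w + 1) % n
            c = (w + 1 - step) % n      # chunk already fully reduced at worker w
            chunks[dst][c] = list(chunks[w][c])

    return [[x for c in chunks[w] for x in c] for w in range(n)]

if __name__ == "__main__":
    workers = [[1.0, 2.0, 3.0, 4.0], [10.0, 20.0, 30.0, 40.0],
               [100.0, 200.0, 300.0, 400.0], [0.0, 0.0, 0.0, 0.0]]
    reduced = ring_all_reduce(workers)
    print(reduced[0])   # every worker now holds the element-wise sum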

HPDL: Towards a general framework for high-performance distributed deep learning

D Li, Z Lai, K Ge, Y Zhang, Z Zhang… - 2019 IEEE 39th …, 2019 - ieeexplore.ieee.org
With the growing scale of data volume and neural network size, we have entered the era
of distributed deep learning. High-performance training and inference on distributed …

A network-centric hardware/algorithm co-design to accelerate distributed training of deep neural networks

Y Li, J Park, M Alian, Y Yuan, Z Qu… - 2018 51st Annual …, 2018 - ieeexplore.ieee.org
Training real-world Deep Neural Networks (DNNs) can take an eon (i.e., weeks or months)
without leveraging distributed systems. Even distributed training takes inordinate time, of …
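
A recurring idea behind such network-centric designs is shrinking the gradient traffic before it hits the wire. The sketch below shows a generic software-level form of this, simple top-k gradient sparsification in PyTorch; it is not the paper's scheme (which compresses gradients in hardware on the network path), and the helper names topk_compress/topk_decompress are invented for the illustration.

# Generic top-k gradient sparsification before communication: only the largest
# entries (and their indices) would be sent; the rest are dropped as zeros.
import torch

def topk_compress(grad: torch.Tensor, ratio: float = 0.01):
    """Keep only the largest-magnitude `ratio` fraction of gradient entries."""
    flat = grad.flatten()
    k = max(1, int(flat.numel() * ratio))
    _, indices = torch.topk(flat.abs(), k)
    return indices, flat[indices], grad.shape      # what would go on the wire

def topk_decompress(indices, values, shape):
    """Rebuild a dense gradient with zeros in the dropped positions."""
    out = torch.zeros(shape, dtype=values.dtype)
    out.view(-1)[indices] = values
    return out

if __name__ == "__main__":
    g = torch.randn(256, 256)
    idx, vals, shape = topk_compress(g, ratio=0.05)
    g_hat = topk_decompress(idx, vals, shape)
    sent = idx.numel() * (idx.element_size() + vals.element_size())
    full = g.numel() * g.element_size()
    print(f"bytes sent: {sent} vs dense: {full}")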

GradientFlow: Optimizing network performance for large-scale distributed DNN training

P Sun, Y Wen, R Han, W Feng… - IEEE Transactions on Big …, 2019 - ieeexplore.ieee.org
It is important to scale out deep neural network (DNN) training for reducing model training
time. The high communication overhead is one of the major performance bottlenecks for …
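
One widely used way to cut this communication overhead is gradient (tensor) fusion: many small per-layer gradients are packed into one flat buffer so a single all-reduce replaces many small messages. The sketch below is a minimal, hedged illustration of that idea with torch.distributed, not GradientFlow itself; it runs standalone with a gloo group of world size 1 and a placeholder address/port, and fused_all_reduce is a name made up for the example.

# Gradient fusion sketch: concatenate per-layer gradients into one flat buffer,
# all-reduce it once, then scatter the averaged values back into each tensor.
import torch
import torch.distributed as dist

def fused_all_reduce(params):
    grads = [p.grad for p in params if p.grad is not None]
    flat = torch.cat([g.flatten() for g in grads])   # one fused buffer
    dist.all_reduce(flat, op=dist.ReduceOp.SUM)      # one message instead of many
    flat /= dist.get_world_size()                    # average across workers
    offset = 0
    for g in grads:
        n = g.numel()
        g.copy_(flat[offset:offset + n].view_as(g))
        offset += n

if __name__ == "__main__":
    dist.init_process_group("gloo", init_method="tcp://127.0.0.1:29500",
                            rank=0, world_size=1)
    model = torch.nn.Sequential(torch.nn.Linear(8, 16), torch.nn.ReLU(),
                                torch.nn.Linear(16, 1))
    loss = model(torch.randn(4, 8)).sum()
    loss.backward()
    fused_all_reduce(list(model.parameters()))
    dist.destroy_process_group()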

Optimizing network performance for distributed DNN training on GPU clusters: ImageNet/AlexNet training in 1.5 minutes

P Sun, W Feng, R Han, S Yan, Y Wen - arXiv preprint arXiv:1902.06855, 2019 - arxiv.org
It is important to scale out deep neural network (DNN) training for reducing model training
time. The high communication overhead is one of the major performance bottlenecks for …

Accelerated training for CNN distributed deep learning through automatic resource-aware layer placement

JH Park, S Kim, J Lee, M Jeon, SH Noh - arXiv preprint arXiv:1901.05803, 2019 - arxiv.org
The Convolutional Neural Network (CNN) model, often used for image classification,
requires significant training time to obtain high accuracy. To this end, distributed training is …
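
Resource-aware layer placement, in its simplest form, profiles per-layer cost and then assigns contiguous groups of layers to different devices so the stages are roughly balanced. The sketch below illustrates only that basic idea, not the paper's placement algorithm or cost model; the helper names and the two-stage split are assumptions, and the second device falls back to CPU so the code runs anywhere.

# Hedged sketch of cost-driven layer placement: profile layers, pick a balanced
# split point, and run the model as two stages on (possibly) different devices.
import time
import torch
import torch.nn as nn

def profile_layers(layers, sample):
    """Measure a rough forward-pass time for each layer on the CPU."""
    costs, x = [], sample
    for layer in layers:
        start = time.perf_counter()
        x = layer(x)
        costs.append(time.perf_counter() - start)
    return costs

def split_balanced(costs):
    """Pick a split point that best balances total cost between two stages."""
    total, running = sum(costs), 0.0
    best, best_gap = 1, float("inf")
    for i, c in enumerate(costs[:-1], start=1):
        running += c
        gap = abs(total - 2 * running)
        if gap < best_gap:
            best, best_gap = i, gap
    return best

if __name__ == "__main__":
    layers = [nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(),
              nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(),
              nn.Flatten(), nn.Linear(32 * 32 * 32, 10)]
    cut = split_balanced(profile_layers(layers, torch.randn(1, 3, 32, 32)))
    dev0 = torch.device("cpu")
    dev1 = torch.device("cuda") if torch.cuda.is_available() else torch.device("cpu")
    stage0 = nn.Sequential(*layers[:cut]).to(dev0)
    stage1 = nn.Sequential(*layers[cut:]).to(dev1)
    out = stage1(stage0(torch.randn(1, 3, 32, 32).to(dev0)).to(dev1))
    print(cut, out.shape)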

Enabling compute-communication overlap in distributed deep learning training platforms

S Rashidi, M Denton, S Sridharan… - 2021 ACM/IEEE 48th …, 2021 - ieeexplore.ieee.org
Deep Learning (DL) training platforms are built by interconnecting multiple DL accelerators
(e.g., GPU/TPU) via fast, customized interconnects with 100s of gigabytes (GBs) of bandwidth …
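
The generic software analogue of this overlap is to launch an asynchronous all-reduce for each gradient as soon as the backward pass produces it and to synchronize only before the optimizer step. The sketch below illustrates that pattern, not the paper's platform; it assumes PyTorch >= 2.1 (for per-tensor post-accumulate-grad hooks), a gloo process group of world size 1 with a placeholder address/port, and the class name OverlappedAllReduce is invented for the example. Real systems additionally bucket gradients rather than issuing one message per tensor.

# Compute-communication overlap sketch: each parameter's gradient is all-reduced
# asynchronously as soon as autograd finishes accumulating it, so communication
# for later layers overlaps with backward computation of earlier layers.
import torch
import torch.distributed as dist

class OverlappedAllReduce:
    def __init__(self, model):
        self.handles = []
        for p in model.parameters():
            p.register_post_accumulate_grad_hook(self._launch)

    def _launch(self, param):
        # Called by autograd right after param.grad has been accumulated.
        handle = dist.all_reduce(param.grad, op=dist.ReduceOp.SUM, async_op=True)
        self.handles.append((handle, param))

    def wait(self):
        # Block until all in-flight reductions are done, then average.
        world = dist.get_world_size()
        for handle, param in self.handles:
            handle.wait()
            param.grad /= world
        self.handles.clear()

if __name__ == "__main__":
    dist.init_process_group("gloo", init_method="tcp://127.0.0.1:29501",
                            rank=0, world_size=1)
    model = torch.nn.Sequential(torch.nn.Linear(32, 64), torch.nn.ReLU(),
                                torch.nn.Linear(64, 1))
    overlap = OverlappedAllReduce(model)
    opt = torch.optim.SGD(model.parameters(), lr=0.1)
    loss = model(torch.randn(16, 32)).pow(2).mean()
    loss.backward()          # hooks fire here, launching async all-reduces
    overlap.wait()           # ensure communication finished before the update
    opt.step()
    dist.destroy_process_group()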

Cynthia: Cost-efficient cloud resource provisioning for predictable distributed deep neural network training

H Zheng, F Xu, L Chen, Z Zhou, F Liu - Proceedings of the 48th …, 2019 - dl.acm.org
It has become an increasingly popular trend for deep neural networks with large-scale datasets
to be trained in a distributed manner in the cloud. However, widely known as resource …

Parallel and distributed training of deep neural networks: A brief overview

A Farkas, G Kertész, R Lovas - 2020 IEEE 24th International …, 2020 - ieeexplore.ieee.org
Deep neural networks and deep learning are becoming important and popular techniques in
modern services and applications. The training of these networks is computationally …
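
For a concrete starting point, the sketch below shows plain synchronous data parallelism with PyTorch's DistributedDataParallel on CPU: each process keeps a model replica, computes gradients on its own shard of the batch, and DDP averages the gradients so all replicas take the same optimizer step. The host/port values and the synthetic data are placeholders.

# Minimal data-parallel training sketch using PyTorch DDP with a gloo backend.
import os
import torch
import torch.distributed as dist
import torch.multiprocessing as mp
from torch.nn.parallel import DistributedDataParallel as DDP

def worker(rank, world_size):
    os.environ["MASTER_ADDR"] = "127.0.0.1"
    os.environ["MASTER_PORT"] = "29502"
    dist.init_process_group("gloo", rank=rank, world_size=world_size)

    model = DDP(torch.nn.Linear(10, 1))
    opt = torch.optim.SGD(model.parameters(), lr=0.1)

    # Each rank sees a different shard of the (synthetic) data.
    torch.manual_seed(rank)
    x, y = torch.randn(32, 10), torch.randn(32, 1)

    for _ in range(5):
        opt.zero_grad()
        loss = torch.nn.functional.mse_loss(model(x), y)
        loss.backward()      # DDP averages gradients across ranks here
        opt.step()

    if rank == 0:
        print("final loss on rank 0:", loss.item())
    dist.destroy_process_group()

if __name__ == "__main__":
    mp.spawn(worker, args=(2,), nprocs=2)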