Distributed machine learning (DML) techniques, such as federated learning, partitioned learning, and distributed reinforcement learning, have been increasingly applied to wireless …
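Federated learning, the first of the DML techniques listed above, is commonly built around weighted averaging of client updates. Below is a minimal sketch of that aggregation step, assuming a few simulated clients holding NumPy parameter vectors; the function name, client count, and sample-size weighting are illustrative choices, not details from the cited survey.

```python
import numpy as np

def federated_average(client_weights, client_sizes):
    """Weighted average of client model parameters (FedAvg-style aggregation).

    client_weights: list of 1-D parameter vectors, one per client.
    client_sizes:   number of local training samples per client, used as weights.
    """
    sizes = np.asarray(client_sizes, dtype=float)
    stacked = np.stack(client_weights)            # shape: (num_clients, num_params)
    return (sizes[:, None] * stacked).sum(axis=0) / sizes.sum()

# Toy usage: three clients with locally updated parameters of length 4.
clients = [np.array([0.9, 1.1, 0.5, 0.0]),
           np.array([1.0, 1.0, 0.4, 0.1]),
           np.array([1.1, 0.9, 0.6, -0.1])]
sizes = [100, 200, 50]                            # local dataset sizes
global_model = federated_average(clients, sizes)
print(global_model)
```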
X Jia, S Song, W He, Y Wang, H Rong, F Zhou… - arXiv preprint arXiv …, 2018 - arxiv.org
Synchronized stochastic gradient descent (SGD) optimizers with data parallelism are widely used in training large-scale deep neural networks. Although using larger mini-batch sizes …
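As a concrete picture of synchronized SGD with data parallelism, the sketch below simulates several workers that each compute a gradient on their own shard of the mini-batch and then apply the same averaged-gradient update, which is the behavior an all-reduce provides in practice. The linear-regression objective, worker count, and learning rate are assumptions made only for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
w_true = np.array([2.0, -3.0])
X = rng.normal(size=(512, 2))
y = X @ w_true + 0.01 * rng.normal(size=512)

num_workers, lr = 4, 0.1
w = np.zeros(2)                                   # model replicated on every worker

for step in range(100):
    shards = np.array_split(rng.permutation(512), num_workers)
    grads = []
    for idx in shards:                            # each worker: local forward/backward
        Xb, yb = X[idx], y[idx]
        grads.append(2 * Xb.T @ (Xb @ w - yb) / len(idx))
    g = np.mean(grads, axis=0)                    # "all-reduce": average the gradients
    w -= lr * g                                   # identical update on every replica

print(w)                                          # close to w_true
```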
W Haensch, T Gokmen, R Puri - Proceedings of the IEEE, 2018 - ieeexplore.ieee.org
Initially developed for gaming and 3-D rendering, graphics processing units (GPUs) were recognized to be a good fit to accelerate deep learning training. Deep learning's simple mathematical …
Data-parallel training is widely used for scaling distributed deep neural network (DNN) training. However, the performance benefits are often limited by the communication-heavy …
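The communication-heavy step in data-parallel training is usually an all-reduce over the gradients, and ring all-reduce (a reduce-scatter followed by an all-gather) is the bandwidth-efficient primitive most frameworks rely on. The sketch below simulates that primitive over in-memory NumPy buffers; the function name and the list-of-arrays stand-in for real workers are assumptions for illustration.

```python
import numpy as np

def ring_allreduce(buffers):
    """Simulate ring all-reduce over a list of equal-length gradient buffers."""
    p = len(buffers)
    chunks = [np.array_split(b.astype(float), p) for b in buffers]

    # Reduce-scatter: after p-1 steps, worker r holds the full sum of chunk (r+1) % p.
    for s in range(p - 1):
        sends = [(r, (r - s) % p, chunks[r][(r - s) % p].copy()) for r in range(p)]
        for r, c, data in sends:
            chunks[(r + 1) % p][c] += data

    # All-gather: circulate each fully reduced chunk once around the ring.
    for s in range(p - 1):
        sends = [(r, (r + 1 - s) % p, chunks[r][(r + 1 - s) % p].copy()) for r in range(p)]
        for r, c, data in sends:
            chunks[(r + 1) % p][c] = data

    return [np.concatenate(c) for c in chunks]

# Toy usage: four workers, each with a local gradient of length 8.
rng = np.random.default_rng(1)
grads = [rng.normal(size=8) for _ in range(4)]
reduced = ring_allreduce(grads)
assert np.allclose(reduced[0], sum(grads))        # every worker ends with the full sum
```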
Distributed deep learning (DL) has become prevalent in recent years to reduce training time by leveraging multiple computing devices (e.g., GPUs/TPUs) due to larger models and …
Distributed synchronous stochastic gradient descent (S-SGD) with data parallelism has been widely used in training large-scale deep neural networks (DNNs), but it typically …
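A simple latency/bandwidth cost model helps explain why the communication in S-SGD typically limits scaling: per-iteration time is local compute plus a ring all-reduce cost of roughly 2(p-1) * alpha + 2(p-1)/p * M/B. The sketch below evaluates that model; the formula is the standard textbook approximation, and all constants (latency, bandwidth, gradient size, compute time) are made-up illustrative values rather than figures from the cited work.

```python
# Hypothetical latency/bandwidth model of one S-SGD iteration:
# iteration time = local compute time + ring all-reduce time for grad_bytes of gradients.
def iteration_time(p, compute_s, grad_bytes, alpha=5e-6, bandwidth=12.5e9):
    """p: workers, compute_s: per-worker fwd/bwd seconds, alpha: per-message latency (s),
    bandwidth: link bandwidth in bytes/s. Returns estimated seconds per iteration."""
    if p == 1:
        return compute_s
    allreduce = 2 * (p - 1) * alpha + 2 * (p - 1) / p * grad_bytes / bandwidth
    return compute_s + allreduce

# Weak-scaling efficiency for a 100 MB gradient and 50 ms of compute per iteration.
for p in (1, 2, 4, 8, 16, 32):
    t = iteration_time(p, compute_s=0.050, grad_bytes=100e6)
    print(p, round(0.050 / t, 3))   # fraction of ideal throughput retained per worker
```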
Modern deep learning applications require increasingly more compute to train state-of-the-art models. To address this demand, large corporations and institutions use dedicated High …
Cloud platforms are increasing their emphasis on sustainability and reducing their operational carbon footprint. A common approach for reducing carbon emissions is to exploit …
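One common way to exploit periods of cleaner electricity is to shift a batch job into the forecast window with the lowest average carbon intensity. The sketch below illustrates that idea; the hourly forecast values, the fixed job length, and the shift-the-whole-job policy are assumptions for illustration, not the mechanism of the cited work.

```python
# Hypothetical carbon-aware scheduling: given an hourly carbon-intensity forecast
# (gCO2/kWh) and a job length in hours, start the job in the cleanest window.
def best_start_hour(forecast, job_hours):
    windows = [(sum(forecast[t:t + job_hours]), t)
               for t in range(len(forecast) - job_hours + 1)]
    total, start = min(windows)
    return start, total / job_hours   # start hour and mean intensity over the window

forecast = [430, 410, 380, 300, 220, 180, 190, 260, 340, 420, 450, 460]  # illustrative
start, mean_ci = best_start_hour(forecast, job_hours=4)
print(f"start at hour {start}, mean intensity {mean_ci:.0f} gCO2/kWh")
```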
Gradient sparsification is a promising technique to significantly reduce the communication overhead in decentralized synchronous stochastic gradient descent (S-SGD) algorithms …
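Top-k selection with error feedback (locally accumulating the entries that were not sent) is one widely used instantiation of gradient sparsification. The sketch below shows that per-worker step; the function name, the choice of k, and the error-feedback variant are assumptions for illustration rather than the exact scheme of any cited algorithm.

```python
import numpy as np

def topk_sparsify(grad, residual, k):
    """Keep the k largest-magnitude entries of (grad + residual); carry the rest forward.

    Returns (sparse_grad, new_residual). sparse_grad is what a worker would
    actually communicate; new_residual is the locally accumulated error.
    """
    acc = grad + residual
    idx = np.argpartition(np.abs(acc), -k)[-k:]   # indices of the k largest magnitudes
    sparse = np.zeros_like(acc)
    sparse[idx] = acc[idx]
    return sparse, acc - sparse

# Toy usage: communicate only 2 of 8 gradient components per step.
rng = np.random.default_rng(0)
residual = np.zeros(8)
for step in range(3):
    grad = rng.normal(size=8)
    sparse, residual = topk_sparsify(grad, residual, k=2)
    print(step, np.count_nonzero(sparse))
```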