Enable deep learning on mobile devices: Methods, systems, and applications

H Cai, J Lin, Y Lin, Z Liu, H Tang, H Wang… - ACM Transactions on …, 2022 - dl.acm.org
Deep neural networks (DNNs) have achieved unprecedented success in the field of artificial
intelligence (AI), including computer vision, natural language processing, and speech …

Dragonn: Distributed randomized approximate gradients of neural networks

Z Wang, Z Xu, X Wu, A Shrivastava… - … on Machine Learning, 2022 - proceedings.mlr.press
Data-parallel distributed training (DDT) has become the de-facto standard for accelerating
the training of most deep learning tasks on massively parallel hardware. In the DDT …
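
The snippet above describes the synchronous data-parallel baseline that DRAGONN builds on. A minimal NumPy sketch of that pattern follows: each simulated worker computes a gradient on its own data shard, the gradients are averaged (the role all-reduce plays in a real system), and every replica applies the same update. DRAGONN's actual contribution, randomized approximate gradients, is not modeled here; all names and shapes are illustrative.

```python
import numpy as np

def data_parallel_step(params, worker_batches, grad_fn, lr=0.1):
    """One step of synchronous data-parallel SGD.

    Each worker computes a gradient on its own shard of the batch; the
    gradients are then averaged (the role played by all-reduce in a real
    DDT framework) and the same update is applied on every replica.
    """
    grads = [grad_fn(params, batch) for batch in worker_batches]  # local compute
    avg_grad = np.mean(grads, axis=0)                             # "all-reduce" (average)
    return params - lr * avg_grad                                 # identical update everywhere

# Toy usage: linear regression split across 4 simulated workers.
rng = np.random.default_rng(0)
w_true = np.array([2.0, -1.0])
X = rng.normal(size=(256, 2))
y = X @ w_true

def grad_fn(w, batch):
    Xb, yb = batch
    return 2.0 * Xb.T @ (Xb @ w - yb) / len(yb)

params = np.zeros(2)
for step in range(200):
    shards = np.array_split(np.arange(256), 4)
    batches = [(X[idx], y[idx]) for idx in shards]
    params = data_parallel_step(params, batches, grad_fn)
```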

A roadmap for big model

S Yuan, H Zhao, S Zhao, J Leng, Y Liang… - arXiv preprint arXiv …, 2022 - arxiv.org
With the rapid development of deep learning, training Big Models (BMs) for multiple
downstream tasks becomes a popular paradigm. Researchers have achieved various …


Fast fitting of the dynamic memdiode model to the conduction characteristics of RRAM devices using convolutional neural networks

FL Aguirre, E Piros, N Kaiser, T Vogel, S Petzold… - Micromachines, 2022 - mdpi.com
In this paper, the use of Artificial Neural Networks (ANNs) in the form of Convolutional
Neural Networks (AlexNET) for the fast and energy-efficient fitting of the Dynamic Memdiode …
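
As a rough illustration of the fitting setup the abstract describes (a CNN that maps measured conduction characteristics to compact-model parameters), here is a minimal PyTorch sketch. The architecture, input length, and parameter count are placeholders chosen for brevity, not the AlexNet-style network or memdiode parameter set used in the paper.

```python
import torch
import torch.nn as nn

# A small 1D CNN that maps a sampled conduction (I-V) curve to the
# parameters of a compact device model. Sizes and layers are stand-ins.
class CurveToParams(nn.Module):
    def __init__(self, n_samples=128, n_params=6):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv1d(1, 16, kernel_size=7, padding=3), nn.ReLU(),
            nn.MaxPool1d(2),
            nn.Conv1d(16, 32, kernel_size=5, padding=2), nn.ReLU(),
            nn.AdaptiveAvgPool1d(8),
        )
        self.head = nn.Sequential(
            nn.Flatten(),
            nn.Linear(32 * 8, 64), nn.ReLU(),
            nn.Linear(64, n_params),         # regressed model parameters
        )

    def forward(self, iv_curve):             # iv_curve: (batch, 1, n_samples)
        return self.head(self.features(iv_curve))

model = CurveToParams()
loss_fn = nn.MSELoss()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)

# One illustrative training step on random stand-in data.
curves = torch.randn(32, 1, 128)              # simulated conduction characteristics
targets = torch.randn(32, 6)                  # "true" fitting parameters
opt.zero_grad()
loss = loss_fn(model(curves), targets)
loss.backward()
opt.step()
```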

Recognition of sago palm trees based on transfer learning

SMA Letsoin, RC Purwestri, F Rahmawan, D Herak - Remote Sensing, 2022 - mdpi.com
The sago palm tree, known as Metroxylon Sagu Rottb, is one of the priority commodities in
Indonesia. Based on our previous research, the potential habitat of the plant has been …
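
The transfer-learning recipe the title refers to typically looks like the PyTorch/torchvision sketch below: reuse an ImageNet-pretrained backbone, freeze it, and train only a new classification head. The backbone, class count, and data here are stand-ins; the paper's actual models and dataset are not reproduced.

```python
import torch
import torch.nn as nn
from torchvision import models

# Start from an ImageNet-pretrained backbone, freeze its weights, and train
# only a new head for the target classes (here, a made-up binary task).
backbone = models.resnet18(weights=models.ResNet18_Weights.IMAGENET1K_V1)
for p in backbone.parameters():
    p.requires_grad = False                            # keep pretrained features fixed

backbone.fc = nn.Linear(backbone.fc.in_features, 2)    # new head is trainable

optimizer = torch.optim.Adam(backbone.fc.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

# One illustrative step on random stand-in images.
images = torch.randn(8, 3, 224, 224)
labels = torch.randint(0, 2, (8,))
optimizer.zero_grad()
loss = loss_fn(backbone(images), labels)
loss.backward()
optimizer.step()
```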

CP-SGD: Distributed stochastic gradient descent with compression and periodic compensation

E Yu, D Dong, Y Xu, S Ouyang, X Liao - Journal of Parallel and Distributed …, 2022 - Elsevier
Communication overhead is the key challenge for distributed training. Gradient compression
is a widely used approach to reduce communication traffic. When combined with a parallel …
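
A minimal sketch of the two ingredients the title names, gradient compression plus compensation, is shown below: top-k sparsification with a locally kept residual that is folded back into later gradients. CP-SGD's periodic compensation schedule and its interaction with the parallel architecture are not modeled; the function names and the value of k are illustrative.

```python
import numpy as np

def topk_compress(grad, k):
    """Keep only the k largest-magnitude entries; return (indices, values)."""
    idx = np.argpartition(np.abs(grad), -k)[-k:]
    return idx, grad[idx]

def compressed_step(grad, residual, k):
    """One worker-side step of compressed SGD with local error compensation.

    The error left behind by compression is stored in `residual` and folded
    back into the next gradient, so no contribution is lost forever. CP-SGD
    adds a periodic compensation schedule on top of this basic idea, which
    is not modeled here.
    """
    compensated = grad + residual
    idx, vals = topk_compress(compensated, k)
    sparse = np.zeros_like(grad)
    sparse[idx] = vals
    new_residual = compensated - sparse        # what was dropped this round
    return (idx, vals), new_residual

# Toy usage: compress a 10-element gradient down to its top 3 entries.
rng = np.random.default_rng(0)
residual = np.zeros(10)
for step in range(3):
    grad = rng.normal(size=10)
    (idx, vals), residual = compressed_step(grad, residual, k=3)
```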

Aperiodic local SGD: Beyond local SGD

H Zhang, T Wu, S Cheng, J Liu - … of the 51st International Conference on …, 2022 - dl.acm.org
Variations of stochastic gradient descent (SGD) methods are at the core of training deep
neural network models. However, in distributed deep learning, where multiple computing …
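
For context, a toy NumPy sketch of the local SGD family the paper extends: each worker runs independent SGD steps and the models are averaged only at chosen synchronization points. With an evenly spaced schedule this is ordinary (periodic) local SGD; the aperiodic variant studied in the paper corresponds to an irregular schedule, which the sketch only hints at via an arbitrary set of sync steps.

```python
import numpy as np

def local_sgd(workers, grad_fn, lr, sync_points, total_steps):
    """Local SGD: each worker takes independent SGD steps, and the models
    are averaged only at the steps listed in `sync_points`.

    Evenly spaced sync_points give plain (periodic) local SGD; an
    aperiodic variant simply uses an irregular schedule.
    """
    models = [w.copy() for w in workers]
    for t in range(total_steps):
        models = [m - lr * grad_fn(m, worker_id=i) for i, m in enumerate(models)]
        if t in sync_points:                          # occasional model averaging
            avg = np.mean(models, axis=0)
            models = [avg.copy() for _ in models]
    return models

# Toy usage: 4 workers minimizing slightly different quadratics.
rng = np.random.default_rng(0)
targets = rng.normal(size=(4, 2))
grad_fn = lambda m, worker_id: 2.0 * (m - targets[worker_id])
init = [np.zeros(2) for _ in range(4)]
final = local_sgd(init, grad_fn, lr=0.1, sync_points={5, 9, 20, 45}, total_steps=50)
```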

Virtualflow: Decoupling deep learning models from the underlying hardware

A Or, H Zhang, MJ Freedman - Proceedings of Machine …, 2022 - proceedings.mlsys.org
We propose VirtualFlow, a system leveraging a novel abstraction called virtual node
processing to decouple the model from the hardware. In each step of training or inference …
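
One common way to decouple a job's effective batch and worker count from the physical device count is to time-multiplex several "virtual" workers onto one accelerator and accumulate their gradients before a single update. The PyTorch sketch below illustrates only that general pattern under those assumptions; it is not VirtualFlow's implementation of virtual node processing.

```python
import torch
import torch.nn as nn

# Toy illustration: each virtual node processes its own micro-batch on the
# same physical device, gradients accumulate, and one update is applied, so
# training semantics (effective batch size) stay fixed regardless of how
# much hardware is available. General pattern only, not VirtualFlow itself.
model = nn.Linear(16, 1)
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
loss_fn = nn.MSELoss()

virtual_nodes = 4                 # fixed by the job, not by the hardware
micro_batch = 8

optimizer.zero_grad()
for v in range(virtual_nodes):    # executed sequentially on one device
    x = torch.randn(micro_batch, 16)
    y = torch.randn(micro_batch, 1)
    loss = loss_fn(model(x), y) / virtual_nodes   # average over virtual nodes
    loss.backward()               # gradients accumulate in .grad
optimizer.step()
```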

Embrace: Accelerating sparse communication for distributed training of deep neural networks

S Li, Z Lai, D Li, Y Zhang, X Ye, Y Duan - Proceedings of the 51st …, 2022 - dl.acm.org
Distributed data-parallel training has been widely adopted for deep neural network (DNN)
models. Although current deep learning (DL) frameworks scale well for dense models like …
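
To make the sparse-communication setting concrete, the NumPy sketch below simulates workers exchanging only the (index, value) pairs of their non-zero gradient entries, as happens with embedding layers in sparse models, so traffic scales with the number of non-zeros rather than the tensor size. The gather-and-rebuild scheme shown is a generic illustration, not EmbRace's communication design.

```python
import numpy as np

def sparse_gather_reduce(worker_grads, dim):
    """Simulate sparse gradient exchange: each worker sends only the
    (index, value) pairs of its non-zero entries, and the dense sum is
    rebuilt from the gathered sparse pieces. Communication volume is
    proportional to the non-zeros, not to `dim`.
    """
    messages = []
    for g in worker_grads:                       # pack: indices + values only
        idx = np.nonzero(g)[0]
        messages.append((idx, g[idx]))
    total = np.zeros(dim)
    for idx, vals in messages:                   # unpack and accumulate
        np.add.at(total, idx, vals)
    return total

# Toy usage: 3 workers, each touching only a few gradient entries.
dim = 20
rng = np.random.default_rng(0)
grads = []
for _ in range(3):
    g = np.zeros(dim)
    hot = rng.choice(dim, size=4, replace=False)
    g[hot] = rng.normal(size=4)
    grads.append(g)
reduced = sparse_gather_reduce(grads, dim)
```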

Bytecomp: Revisiting gradient compression in distributed training

Z Wang, H Lin, Y Zhu, TS Ng - arXiv preprint arXiv:2205.14465, 2022 - arxiv.org
Gradient compression (GC) is a promising approach to addressing the communication
bottleneck in distributed deep learning (DDL). However, it is challenging to find the optimal …
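
The trade-off ByteComp reasons about can be sketched with a back-of-envelope cost model: compressing a gradient tensor only helps when the encode/decode overhead is smaller than the communication time it saves. The toy function below illustrates that comparison; the numbers and the model itself are illustrative simplifications, not ByteComp's decision procedure.

```python
def worth_compressing(tensor_bytes, bandwidth_Bps, compress_ratio, overhead_s):
    """Compare sending a gradient uncompressed with compressing it first.

    Returns whether compression pays off under this simplistic model, plus
    both estimated times. A real system would also model overlap with
    compute, network topology, and per-tensor characteristics.
    """
    t_dense = tensor_bytes / bandwidth_Bps
    t_compressed = overhead_s + (tensor_bytes * compress_ratio) / bandwidth_Bps
    return t_compressed < t_dense, t_dense, t_compressed

# Example: a 400 MB gradient over a 10 GB/s link, 1% top-k retention,
# and 25 ms of encode/decode overhead.
ok, t_dense, t_comp = worth_compressing(400e6, 10e9, 0.01, 0.025)
print(ok, round(t_dense, 4), round(t_comp, 4))   # True 0.04 0.0254
```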