Scalable deep learning on distributed infrastructures: Challenges, techniques, and tools

R Mayer, HA Jacobsen - ACM Computing Surveys (CSUR), 2020 - dl.acm.org
Deep Learning (DL) has had immense success in the recent past, leading to state-of-the-
art results in various domains, such as image recognition and natural language processing …

A survey of techniques for optimizing deep learning on GPUs

S Mittal, S Vaishay - Journal of Systems Architecture, 2019 - Elsevier
The rise of deep learning (DL) has been fuelled by improvements in accelerators. Due to
its unique features, the GPU remains the most widely used accelerator for DL …

Liquid: Intelligent resource estimation and network-efficient scheduling for deep learning jobs on distributed GPU clusters

R Gu, Y Chen, S Liu, H Dai, G Chen… - … on Parallel and …, 2021 - ieeexplore.ieee.org
Deep learning (DL) is becoming increasingly popular in many domains, including computer
vision, speech recognition, self-driving automobiles, etc. GPUs can train DL models efficiently …

A review on community detection in large complex networks from conventional to deep learning methods: A call for the use of parallel meta-heuristic algorithms

MN Al-Andoli, SC Tan, WP Cheah, SY Tan - IEEE Access, 2021 - ieeexplore.ieee.org
Complex networks (CNs) have gained much attention in recent years due to their
importance and popularity. The rapid growth in the size of CNs leads to more difficulties in …

Optimizing distributed training deployment in heterogeneous GPU clusters

X Yi, S Zhang, Z Luo, G Long, L Diao, C Wu… - Proceedings of the 16th …, 2020 - dl.acm.org
This paper proposes HeteroG, an automatic module to accelerate deep neural network
training in heterogeneous GPU clusters. To train a deep learning model with large amounts …

Fast training of deep learning models over multiple GPUs

X Yi, Z Luo, C Meng, M Wang, G Long, C Wu… - Proceedings of the 21st …, 2020 - dl.acm.org
This paper proposes FastT, a transparent module to work with the TensorFlow framework for
automatically identifying a satisfying deployment and execution order of operations in DNN …

Marble: A multi-GPU aware job scheduler for deep learning on HPC systems

J Han, MM Rafique, L Xu, AR Butt… - 2020 20th IEEE/ACM …, 2020 - ieeexplore.ieee.org
Deep learning (DL) has become a key tool for solving complex scientific problems. However,
managing the multi-dimensional large-scale data associated with DL, especially atop extant …

Garfield: System support for Byzantine machine learning (regular paper)

R Guerraoui, A Guirguis, J Plassmann… - 2021 51st Annual …, 2021 - ieeexplore.ieee.org
We present GARFIELD, a library to transparently make machine learning (ML) applications,
initially built with popular (but fragile) frameworks, e.g., TensorFlow and PyTorch, Byzantine …

PSNet: Reconfigurable network topology design for accelerating parameter server architecture based distributed machine learning

L Liu, Q Jin, D Wang, H Yu, G Sun, S Luo - Future Generation Computer …, 2020 - Elsevier
The bottleneck of Distributed Machine Learning (DML) has shifted from computation
to communication. Many works have focused on speeding up the communication phase from …

Online job scheduling for distributed machine learning in optical circuit switch networks

L Liu, H Yu, G Sun, H Zhou, Z Li, S Luo - Knowledge-Based Systems, 2020 - Elsevier
Networking has become a well-known performance bottleneck for distributed machine
learning (DML). Although many works have focused on accelerating the communication …