Author
Zhuang Wang
Publication Date
2023/12
Institution
Rice University
Abstract
The record-breaking performance of deep neural networks (DNNs) has brought remarkable success to many domains, such as computer vision, natural language processing, and recommendation systems. However, these powerful DNNs come with significant computational cost driven by growing training data and model sizes. Because network bandwidth in GPU clouds has not kept pace with improvements in GPU compute capacity or with the rapid growth of datasets and models, deep learning practitioners have struggled to scale up DNN training efficiently. This thesis identifies and addresses research challenges in scaling distributed deep learning (DDL) by optimizing communication in both the data plane and the management plane.