Authors
Zhuang Wang, Haibin Lin, Yibo Zhu, TS Eugene Ng
Publication date
2023/5/8
Book
Proceedings of the Eighteenth European Conference on Computer Systems
Pages
867-882
Description
Gradient compression (GC) is a promising approach to addressing the communication bottleneck in distributed deep learning (DDL). It reduces communication time but also incurs additional computation overhead. The training throughput of compression-enabled DDL is determined by the compression strategy, including whether to compress each tensor, the type of compute resources (e.g., CPUs or GPUs) used for compression, the communication schemes for compressed tensors, and so on. However, it is challenging to find the optimal compression strategy for applying GC to DDL because of the intricate interactions among tensors. To fully unleash the benefits of GC, two questions must be addressed: 1) How to express any compression strategy and the corresponding interactions among tensors of any DDL training job? 2) How to quickly select a near-optimal compression strategy?
In this paper, we propose …
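To make the notion of a per-tensor compression strategy concrete, the minimal sketch below shows one common GC technique, top-k sparsification, applied selectively depending on tensor size. The compression ratio, the size threshold, and the selection rule are illustrative assumptions for this sketch only; they are not the strategy-selection method proposed in the paper.

```python
# Illustrative sketch of per-tensor top-k gradient compression (assumed
# example, not the paper's method). Requires PyTorch.
import torch


def topk_compress(grad: torch.Tensor, ratio: float = 0.01):
    """Keep only the largest-magnitude `ratio` fraction of gradient entries."""
    flat = grad.flatten()
    k = max(1, int(flat.numel() * ratio))
    _, indices = torch.topk(flat.abs(), k)
    # Return the signed values, their positions, and the original shape.
    return flat[indices], indices, grad.shape


def topk_decompress(values, indices, shape):
    """Rebuild a dense gradient tensor from the sparse representation."""
    flat = torch.zeros(shape.numel(), dtype=values.dtype, device=values.device)
    flat[indices] = values
    return flat.view(shape)


def maybe_compress(grad: torch.Tensor, min_numel: int = 1 << 16):
    """Per-tensor decision (hypothetical rule): small tensors are sent
    uncompressed because compression overhead can outweigh the bandwidth
    savings; large tensors are sparsified before communication."""
    if grad.numel() < min_numel:
        return ("raw", grad)
    return ("topk", topk_compress(grad))


if __name__ == "__main__":
    g = torch.randn(1024, 1024)
    kind, payload = maybe_compress(g)
    if kind == "topk":
        restored = topk_decompress(*payload)
        print(kind, restored.shape)
```

In a real DDL job, a decision like the one in `maybe_compress` would have to be made jointly across all tensors and compute resources, which is exactly the strategy-selection problem the abstract describes.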