Authors
Cong Xie, Shuai Zheng, Sanmi Koyejo, Indranil Gupta, Mu Li, Haibin Lin
Publication date
2020
Journal
Advances in Neural Information Processing Systems
Volume
33
Pages
12593-12603
Description
The scalability of Distributed Stochastic Gradient Descent (SGD) is today limited by communication bottlenecks. We propose a novel SGD variant: Communication-efficient SGD with Error Reset, or CSER. The key idea in CSER is first a new technique called "error reset" that adapts arbitrary compressors for SGD, producing bifurcated local models with periodic reset of the resulting local residual errors. Second, we introduce partial synchronization for both the gradients and the models, leveraging the advantages of each. We prove the convergence of CSER for smooth non-convex problems. Empirical results show that when combined with highly aggressive compressors, the CSER algorithms accelerate distributed training by nearly 10x for CIFAR-100, and by 4.5x for ImageNet.
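A minimal, illustrative sketch of the error-reset idea the abstract describes: the residual dropped by the compressor is kept locally (so the local model drifts from the synchronized one) and is periodically folded back and cleared. This is not the paper's reference implementation; the top-k compressor, class names, and hyperparameters below are assumptions for illustration only.

```python
import numpy as np

def topk_compress(x, k):
    """Keep the k largest-magnitude entries of x; zero out the rest."""
    idx = np.argsort(np.abs(x))[-k:]
    sent = np.zeros_like(x)
    sent[idx] = x[idx]
    return sent

class ErrorResetWorker:
    """One worker's view of compressed SGD with a periodic error reset.

    `error` accumulates whatever the compressor dropped; it is applied
    locally and cleared every `reset_every` steps, which bounds the drift
    between the local model and the synchronized one.
    """
    def __init__(self, dim, k, lr=0.1, reset_every=8):
        self.model = np.zeros(dim)   # synchronized model replica
        self.error = np.zeros(dim)   # local residual error
        self.k, self.lr = k, lr
        self.reset_every = reset_every
        self.step = 0

    def local_step(self, grad):
        # Fold the residual into the update before compressing.
        update = self.lr * grad + self.error
        sent = topk_compress(update, self.k)   # communicated part
        self.error = update - sent             # part kept locally
        self.model -= sent
        self.step += 1
        if self.step % self.reset_every == 0:
            # Error reset: apply the residual locally and clear it.
            self.model -= self.error
            self.error[:] = 0.0
        return sent
```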
Total citations
[Citations-per-year chart, 2019-2024]
Scholar articles
C Xie, S Zheng, S Koyejo, I Gupta, M Li, H Lin - Advances in Neural Information Processing Systems, 2020