S Gan, X Lian, R Wang, J Chang, C Liu, H Shi… - arXiv preprint arXiv …, 2021 - arxiv.org
Recent years have witnessed a growing list of systems for distributed data-parallel training.
Existing systems largely fit into two paradigms, i.e., parameter server and MPI-style collective …