Topologies in distributed machine learning: Comprehensive survey, recommendations and future directions

L Liu, P Zhou, G Sun, X Chen, T Wu, H Yu, M Guizani - Neurocomputing, 2024 - Elsevier
With the widespread use of distributed machine learning (DML), many IT companies have
established networks dedicated to DML. Different communication architectures of DML have …

PSNet: Reconfigurable network topology design for accelerating parameter server architecture based distributed machine learning

L Liu, Q Jin, D Wang, H Yu, G Sun, S Luo - Future Generation Computer …, 2020 - Elsevier
The bottleneck of Distributed Machine Learning (DML) has shifted from computation
to communication. Lots of works have focused on speeding up communication phase from …

Hardware-Software Co-design for Optimizing MPI Programs in Data Center Network

A Rahbar - 2021 - search.proquest.com
High Performance Computing (HPC) systems are critical. A single server/processor
cannot handle the heavy computation needs of today's applications. HPC systems are built …