Wrht: Efficient all-reduce for distributed DNN training in optical interconnect systems

F Dai, Y Chen, Z Huang, H Zhang - Proceedings of the 52nd …, 2023 - dl.acm.org
Communication efficiency is crucial for accelerating distributed deep neural network (DNN)
training. All-reduce, a vital communication primitive, is responsible for reducing model …

Modoru: Clos nanosecond optical switching for distributed deep training

C Wang, N Yoshikane, D Elson… - Journal of Optical …, 2023 - ieeexplore.ieee.org
Distributed deep training has become a significant consumer of bandwidth across
datacenter-scale networks. The diverse parallel strategies employed in deep training require …

Partitioning Distributed Compute Jobs with Reinforcement Learning and Graph Neural Networks

CWF Parsonson, Z Shabka, A Ottino… - arXiv preprint arXiv …, 2023 - arxiv.org
From natural language processing to genome sequencing, large-scale machine learning
models are bringing advances to a broad range of fields. Many of these models are too large …

Performance Comparison of Distributed DNN Training on Optical Versus Electrical Interconnect Systems

F Dai, Y Chen, Z Huang, H Zhang, H Tian - International Conference on …, 2023 - Springer
Parallel and distributed Deep Neural Network (DNN) training have become integral
in data centers, significantly reducing DNN training time. The interconnection type among …

Optimisation for Optical Data Centre Switching and Networking with Artificial Intelligence

Z Shabka - 2023 - discovery.ucl.ac.uk
Cloud and cluster computing platforms have become standard across almost every domain
of business, and their scale quickly approaches $\mathbf{O}(10^6)$ servers in a single …

Accelerating Deep Neural Network Training on Optical Interconnect Systems

F Dai - 2023 - ourarchive.otago.ac.nz
As deep learning (DL) algorithms evolve and data volumes expand, training deep neural
networks (DNNs) has become essential across various domains, delivering unprecedented …

Design and Evaluation of Demand- and Topology Reconfiguration-aware Networks

JPD Zerwas - 2023 - mediatum.ub.tum.de
This thesis investigates demand- and topological reconfiguration-awareness in
communication networks on a macroscopic level along three dimensions. It presents …