F Liang, Z Zhang, H Lu, V Leung, Y Guo… - arXiv preprint arXiv …, 2024 - arxiv.org
With the rapid growth in the volume of data sets, models, and devices in the domain of deep learning, there is increasing attention on large-scale distributed deep learning. In contrast to …
Z Wang, X Wu, Z Xu, TS Ng - Proceedings of Machine …, 2023 - proceedings.mlsys.org
Data-parallel distributed training (DDT) is the de facto way to accelerate deep learning on multiple GPUs. In DDT, communication for gradient synchronization is the major efficiency …
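As a rough illustration of the gradient-synchronization step this entry refers to, the sketch below shows a per-iteration all-reduce of parameter gradients, assuming PyTorch's torch.distributed package. The single-process group, the sync_gradients helper, and the toy linear model are illustrative choices, not taken from the paper.

```python
import os
import torch
import torch.distributed as dist

def sync_gradients(model: torch.nn.Module) -> None:
    """Average each parameter's gradient across ranks (one all-reduce per tensor)."""
    world_size = dist.get_world_size()
    for param in model.parameters():
        if param.grad is not None:
            dist.all_reduce(param.grad, op=dist.ReduceOp.SUM)
            param.grad.div_(world_size)

if __name__ == "__main__":
    # Single-process group so the sketch runs standalone;
    # real DDT launches one process per GPU (e.g. via torchrun) with backend="nccl".
    os.environ.setdefault("MASTER_ADDR", "127.0.0.1")
    os.environ.setdefault("MASTER_PORT", "29500")
    dist.init_process_group(backend="gloo", rank=0, world_size=1)

    model = torch.nn.Linear(1024, 1024)
    loss = model(torch.randn(32, 1024)).sum()
    loss.backward()          # compute local gradients
    sync_gradients(model)    # communication step that dominates DDT iteration time
    dist.destroy_process_group()
```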
Large deep learning models have recently garnered substantial attention from both academia and industry. Nonetheless, frequent failures are observed during large model …
Activation compressed training provides a way to reduce the memory cost of training deep neural networks (DNNs). However, state-of-the-art work combines a search of …
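For context on what compressing activations means in practice, here is a minimal, assumed PyTorch sketch: a custom autograd function that saves a lower-precision (fp16) copy of a ReLU activation in the forward pass and reconstructs the gradient from it in the backward pass. The CompressedReLU name and the fp16 choice are illustrative, not taken from the cited work.

```python
import torch

class CompressedReLU(torch.autograd.Function):
    @staticmethod
    def forward(ctx, x):
        out = torch.relu(x)
        # Save a compressed (half-precision) copy instead of the full-precision activation.
        ctx.save_for_backward(out.to(torch.float16))
        return out

    @staticmethod
    def backward(ctx, grad_out):
        (out_fp16,) = ctx.saved_tensors
        # Decompress just enough information to form the ReLU gradient mask.
        mask = (out_fp16 > 0).to(grad_out.dtype)
        return grad_out * mask

x = torch.randn(4, 8, requires_grad=True)
y = CompressedReLU.apply(x)
y.sum().backward()
print(x.grad.shape)  # gradients computed from the compressed activation
```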
Z Wang, Z Xu, A Shrivastava, TS Ng - arXiv preprint arXiv:2309.13254, 2023 - arxiv.org
Distributed training is the de facto standard to scale up the training of Deep Neural Networks (DNNs) with multiple GPUs. The performance bottleneck of distributed training lies in …
H Liu, Z Liu, K Zhou, T Zhao, N Shah, X Hu - 2023 - openreview.net
Graph neural networks (GNNs) have achieved considerable success in graph-based learning tasks, yet training GNNs on large graphs is still inefficient. The root cause is the graph-based …
Deep neural networks (DNNs) have achieved unparalleled performance in numerous fields, including computer vision, natural language processing, and recommendation systems …