Exploring multi-dimensional hierarchical network topologies for efficient distributed training...

3 results (0.01 seconds)

[PDF] ieee.org

Peta-scale embedded photonics architecture for distributed deep learning applications

Z Wu, LY Dai, A Novick, M Glick, Z Zhu… - Journal of Lightwave …, 2023 - ieeexplore.ieee.org

As Deep Learning (DL) models grow larger and more complex, training jobs are
increasingly distributed across multiple Computing Units (CU) such as GPUs and TPUs …

Cited by 9 · Related articles · All 8 versions

[PDF] arxiv.org

Impact of RoCE congestion control policies on distributed training of DNNs

T Khan, S Rashidi, S Sridharan… - … IEEE Symposium on …, 2022 - ieeexplore.ieee.org

Abstract: RDMA over Converged Ethernet (RoCE) has gained significant attraction for
datacenter networks due to its compatibility with conventional Ethernet-based fabric …

Cited by 8 · Related articles · All 7 versions

[PDF] arxiv.org

TACOS: Topology-Aware Collective Algorithm Synthesizer for Distributed Machine Learning

W Won, M Elavazhagan, S Srinivasan, A Durg… - arXiv preprint arXiv …, 2023 - arxiv.org

The surge of artificial intelligence, specifically large language models, has led to a rapid
advent towards the development of large-scale machine learning training clusters …

Cited by 1 · Related articles · All 2 versions
