Model compression and hardware acceleration for neural networks: A comprehensive survey

L Deng, G Li, S Han, L Shi, Y Xie - Proceedings of the IEEE, 2020 - ieeexplore.ieee.org
Domain-specific hardware is becoming a promising topic against the backdrop of slowing improvement in general-purpose processors due to the foreseeable end of Moore's Law …

Communication-efficient edge AI: Algorithms and systems

Y Shi, K Yang, T Jiang, J Zhang… - … Surveys & Tutorials, 2020 - ieeexplore.ieee.org
Artificial intelligence (AI) has achieved remarkable breakthroughs in a wide range of fields, from speech processing and image classification to drug discovery. This is driven by the …

PowerSGD: Practical low-rank gradient compression for distributed optimization

T Vogels, SP Karimireddy… - Advances in Neural …, 2019 - proceedings.neurips.cc
We study gradient compression methods to alleviate the communication bottleneck in data-
parallel distributed optimization. Despite the significant attention received, current …
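
The core idea in PowerSGD is low-rank gradient compression: each layer's gradient is treated as a matrix and approximated by a rank-r factorization computed with a single power-iteration step, so only two small factors need to be communicated. The sketch below illustrates that idea in NumPy under simplifying assumptions (no error feedback, no all-reduce); the function names and shapes are illustrative, not the authors' reference implementation.

    import numpy as np

    def compress(grad, q_prev):
        # One power-iteration step: approximate grad as p @ q.T and return the factors.
        m = grad.reshape(grad.shape[0], -1)   # view the gradient as a matrix
        p = m @ q_prev                        # (n, rank) left factor
        p, _ = np.linalg.qr(p)                # orthonormalize the left factor
        q = m.T @ p                           # (k, rank) right factor
        return p, q

    def decompress(p, q, shape):
        return (p @ q.T).reshape(shape)

    rng = np.random.default_rng(0)
    grad = rng.standard_normal((256, 128))
    q_prev = rng.standard_normal((128, 2))    # stand-in for the previous step's right factor
    p, q = compress(grad, q_prev)
    approx = decompress(p, q, grad.shape)
    # Only p and q (256*2 + 128*2 values) would be communicated instead of 256*128.
    print(np.linalg.norm(grad - approx) / np.linalg.norm(grad))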

Grace: A compressed communication framework for distributed machine learning

H Xu, CY Ho, AM Abdelmoniem, A Dutta… - 2021 IEEE 41st …, 2021 - ieeexplore.ieee.org
Powerful computer clusters are used nowadays to train complex deep neural networks (DNNs) on large datasets. Distributed training increasingly becomes communication bound …
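
GRACE's contribution is a unified framework in which different gradient compressors can be plugged into the same distributed training pipeline. The sketch below shows the general shape of such a pluggable-compressor abstraction, with a simple scaled sign compressor as one instance; the class and method names are hypothetical and do not reflect GRACE's actual API.

    import numpy as np

    class Compressor:
        # Interface every compressor implements; names here are hypothetical.
        def compress(self, tensor):
            raise NotImplementedError
        def decompress(self, payload, shape):
            raise NotImplementedError

    class SignCompressor(Compressor):
        # 1-bit sign compression with a single scaling factor (scaled sign-SGD style).
        def compress(self, tensor):
            scale = np.mean(np.abs(tensor))
            return np.sign(tensor).astype(np.int8), scale
        def decompress(self, payload, shape):
            signs, scale = payload
            return signs.astype(np.float32).reshape(shape) * scale

    grad = np.random.default_rng(1).standard_normal((4, 4)).astype(np.float32)
    comp = SignCompressor()
    payload = comp.compress(grad)             # what would be sent over the network
    restored = comp.decompress(payload, grad.shape)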

PipeGCN: Efficient full-graph training of graph convolutional networks with pipelined feature communication

C Wan, Y Li, CR Wolfe, A Kyrillidis, NS Kim… - arXiv preprint arXiv …, 2022 - arxiv.org
Graph Convolutional Networks (GCNs) are the state-of-the-art method for learning from graph-structured data, and training large-scale GCNs requires distributed training across multiple …
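
PipeGCN's pipelining idea is to let each partition compute with slightly stale boundary-node features received during the previous iteration, so feature communication overlaps with local computation instead of blocking it. The toy sketch below simulates that staleness pattern in a single process under assumptions of my own (random stand-in features, one ReLU layer); it is not the paper's algorithm or implementation.

    import numpy as np

    def local_forward(local_feats, stale_boundary_feats, weight):
        # Combine local features with (possibly stale) remote boundary features.
        h = np.concatenate([local_feats, stale_boundary_feats], axis=0)
        return np.maximum(h @ weight, 0.0)    # one ReLU graph layer, heavily simplified

    rng = np.random.default_rng(0)
    weight = rng.standard_normal((16, 16)) * 0.1
    stale_boundary = np.zeros((8, 16))        # the first epoch starts from zeros

    for epoch in range(3):
        local_feats = rng.standard_normal((32, 16))
        out = local_forward(local_feats, stale_boundary, weight)
        # In a real distributed run, fresh boundary features would be sent
        # asynchronously here and consumed only in the *next* epoch.
        stale_boundary = rng.standard_normal((8, 16))   # stand-in for received features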

A survey on large-scale machine learning

M Wang, W Fu, X He, S Hao… - IEEE Transactions on …, 2020 - ieeexplore.ieee.org
Machine learning can provide deep insights into data, allowing machines to make high-quality predictions, and has been widely used in real-world applications such as text …

Accelerating distributed reinforcement learning with in-switch computing

Y Li, IJ Liu, Y Yuan, D Chen, A Schwing… - Proceedings of the 46th …, 2019 - dl.acm.org
Reinforcement learning (RL) has attracted much attention recently, as new and emerging AI-based applications are demanding the capability to intelligently react to environment …

Scalecom: Scalable sparsified gradient compression for communication-efficient distributed training

CY Chen, J Ni, S Lu, X Cui, PY Chen… - Advances in …, 2020 - proceedings.neurips.cc
Large-scale distributed training of Deep Neural Networks (DNNs) on state-of-the-art platforms is expected to be severely communication constrained. To overcome this …
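
ScaleCom belongs to the family of sparsified gradient compression methods, in which each worker transmits only the largest-magnitude gradient entries and keeps the dropped mass in a local residual (error feedback). The sketch below shows that baseline top-k-with-feedback pattern, not the paper's scalable compressor; the names and the 1% compression ratio are illustrative.

    import numpy as np

    def topk_with_feedback(grad, residual, k):
        corrected = grad + residual                        # add back previously dropped mass
        idx = np.argpartition(np.abs(corrected), -k)[-k:]  # indices of the k largest magnitudes
        values = corrected[idx]
        new_residual = corrected.copy()
        new_residual[idx] = 0.0                            # what stays behind locally
        return (idx, values), new_residual

    rng = np.random.default_rng(0)
    grad = rng.standard_normal(1000)
    residual = np.zeros_like(grad)
    (idx, values), residual = topk_with_feedback(grad, residual, k=10)
    # Only (idx, values) -- 1% of the entries -- would be communicated.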

Compressed communication for distributed deep learning: Survey and quantitative evaluation

H Xu, CY Ho, AM Abdelmoniem, A Dutta, EH Bergou… - 2020 - repository.kaust.edu.sa
Powerful computer clusters are used nowadays to train complex deep neural networks (DNNs) on large datasets. Distributed training workloads increasingly become …

Check-N-Run: A checkpointing system for training deep learning recommendation models

A Eisenman, KK Matam, S Ingram, D Mudigere… - … USENIX Symposium on …, 2022 - usenix.org
Checkpoints play an important role in training long running machine learning (ML) models.
Checkpoints take a snapshot of an ML model and store it in a non-volatile memory so that …
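
The snippet describes checkpointing as snapshotting an ML model to non-volatile storage. The minimal sketch below shows the basic periodic-checkpoint pattern with an atomic rename, so a crash never leaves a partially written file; the paths, the pickle format, and the interval are my own illustrative choices, and none of Check-N-Run's actual optimizations are shown.

    import os, pickle, tempfile

    def save_checkpoint(state, path):
        # Write to a temporary file first, then rename, so a crash never
        # leaves a partially written checkpoint at `path`.
        fd, tmp = tempfile.mkstemp(dir=os.path.dirname(path) or ".")
        with os.fdopen(fd, "wb") as f:
            pickle.dump(state, f)
        os.replace(tmp, path)                 # atomic rename

    state = {"step": 0, "weights": [0.0] * 4}
    for step in range(1, 101):
        state["step"] = step
        if step % 25 == 0:                    # checkpoint every 25 steps
            save_checkpoint(state, "model.ckpt")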