Communication-efficient edge AI: Algorithms and systems

Y Shi, K Yang, T Jiang, J Zhang… - … Surveys & Tutorials, 2020 - ieeexplore.ieee.org
Artificial intelligence (AI) has achieved remarkable breakthroughs in a wide range of fields,
ranging from speech processing, image classification to drug discovery. This is driven by the …

A survey of predictive maintenance: Systems, purposes and approaches

Y Ran, X Zhou, P Lin, Y Wen, R Deng - arXiv preprint arXiv:1912.07383, 2019 - arxiv.org
This paper provides a comprehensive literature review on Predictive Maintenance (PdM)
with emphasis on system architectures, purposes and approaches. In industry, any outages …

Knowledge distillation: A survey

J Gou, B Yu, SJ Maybank, D Tao - International Journal of Computer Vision, 2021 - Springer
In recent years, deep neural networks have been successful in both industry and academia,
especially for computer vision tasks. The great success of deep learning is mainly due to its …

MLPerf training benchmark

P Mattson, C Cheng, G Diamos… - Proceedings of …, 2020 - proceedings.mlsys.org
Machine learning is experiencing an explosion of software and hardware solutions,
and needs industry-standard performance benchmarks to drive design and enable …

Enable deep learning on mobile devices: Methods, systems, and applications

H Cai, J Lin, Y Lin, Z Liu, H Tang, H Wang… - ACM Transactions on …, 2022 - dl.acm.org
Deep neural networks (DNNs) have achieved unprecedented success in the field of artificial
intelligence (AI), including computer vision, natural language processing, and speech …

FFCV: Accelerating training by removing data bottlenecks

G Leclerc, A Ilyas, L Engstrom… - Proceedings of the …, 2023 - openaccess.thecvf.com
We present FFCV, a library for easy, fast, resource-efficient training of machine learning
models. FFCV speeds up model training by eliminating (often subtle) data bottlenecks from …

A survey of techniques for optimizing deep learning on GPUs

S Mittal, S Vaishay - Journal of Systems Architecture, 2019 - Elsevier
The rise of deep-learning (DL) has been fuelled by the improvements in accelerators. Due to
its unique features, the GPU continues to remain the most widely used accelerator for DL …

Delayed gradient averaging: Tolerate the communication latency for federated learning

L Zhu, H Lin, Y Lu, Y Lin, S Han - Advances in Neural …, 2021 - proceedings.neurips.cc
Federated Learning is an emerging direction in distributed machine learning that enables
jointly training a model without sharing the data. Since the data is distributed across many …

SiP-ML: high-bandwidth optical network interconnects for machine learning training

M Khani, M Ghobadi, M Alizadeh, Z Zhu… - Proceedings of the …, 2021 - dl.acm.org
This paper proposes optical network interconnects as a key enabler for building high-
bandwidth ML training clusters with strong scaling properties. Our design, called SiP-ML …

CASSINI: Network-Aware Job Scheduling in Machine Learning Clusters

S Rajasekaran, M Ghobadi, A Akella - 21st USENIX Symposium on …, 2024 - usenix.org
We present CASSINI, a network-aware job scheduler for machine learning (ML) clusters.
CASSINI introduces a novel geometric abstraction to consider the communication pattern of …