Scavenger: A Cloud Service for Optimizing Cost and Performance of ML Training

S Tyagi, P Sharma - … Symposium on Cluster, Cloud and Internet …, 2023 - ieeexplore.ieee.org
Cloud computing platforms can provide the computational resources required for training
large machine learning models such as deep neural networks. While the pay-as-you-go …

A Survey From Distributed Machine Learning to Distributed Deep Learning

M Dehghani, Z Yazdanparast - arXiv preprint arXiv:2307.05232, 2023 - arxiv.org
Artificial intelligence has achieved significant success in handling complex tasks in recent
years. This success is due to advances in machine learning algorithms and hardware …

Edgepipe: Tailoring pipeline parallelism with deep neural networks for volatile wireless edge devices

JY Yoon, Y Byeon, J Kim, HJ Lee - IEEE Internet of Things …, 2021 - ieeexplore.ieee.org
As intelligence recently moves to the edge to tackle the problems of privacy, scalability, and
network bandwidth in centralized intelligence, it is necessary to construct an efficient yet …

ADS-CNN: Adaptive Dataflow Scheduling for lightweight CNN accelerator on FPGAs

Y Wan, X Xie, J Chen, K Xie, D Yi, Y Lu, K Gai - Future Generation …, 2024 - Elsevier
Lightweight convolutional neural networks (CNNs) enable lower inference latency and data
traffic, facilitating deployment on resource-constrained edge devices such as field …

A Survey on Design Methodologies for Accelerating Deep Learning on Heterogeneous Architectures

F Ferrandi, S Curzel, L Fiorin, D Ielmini… - arXiv preprint arXiv …, 2023 - arxiv.org
In recent years, the field of Deep Learning has seen many disruptive and impactful
advancements. Given the increasing complexity of deep neural networks, the need for …

DGT: A contribution-aware differential gradient transmission mechanism for distributed machine learning

H Zhou, Z Li, Q Cai, H Yu, S Luo, L Luo… - Future Generation …, 2021 - Elsevier
Distributed machine learning is a mainstream system to learn insights for analytics and
intelligence services on many fronts (e.g., health, streaming, and business) from their massive …

Stochastic optimization with laggard data pipelines

N Agarwal, R Anil, T Koren… - Advances in Neural …, 2020 - proceedings.neurips.cc
State-of-the-art optimization is steadily shifting towards massively parallel pipelines with
extremely large batch sizes. As a consequence, CPU-bound preprocessing and …

Accelerating distributed deep learning using multi-path RDMA in data center networks

F Tian, Y Zhang, W Ye, C Jin, Z Wu… - Proceedings of the ACM …, 2021 - dl.acm.org
Data center networks (DCNs) have widely deployed RDMA to support data-intensive
applications such as machine learning. While DCNs are designed with rich multi-path …

Audio-based anomaly detection on edge devices via self-supervision and spectral analysis

F Lo Scudo, E Ritacco, L Caroprese… - Journal of Intelligent …, 2023 - Springer
In real-world applications, audio surveillance is often performed by large models that can
detect many types of anomalies. However, typical approaches are based on centralized …

LayerPipe: Accelerating deep neural network training by intra-layer and inter-layer gradient pipelining and multiprocessor scheduling

NK Unnikrishnan, KK Parhi - 2021 IEEE/ACM International …, 2021 - ieeexplore.ieee.org
The time required to train neural networks increases with their size, complexity, and
depth. Training model parameters by backpropagation inherently creates feedback loops …