Deep learning in mobile and wireless networking: A survey

C Zhang, P Patras, H Haddadi - IEEE Communications surveys …, 2019 - ieeexplore.ieee.org
The rapid uptake of mobile devices and the rising popularity of mobile applications and
services pose unprecedented demands on mobile and wireless networking infrastructure …

Demystifying parallel and distributed deep learning: An in-depth concurrency analysis

T Ben-Nun, T Hoefler - ACM Computing Surveys (CSUR), 2019 - dl.acm.org
Deep Neural Networks (DNNs) are becoming an important tool in modern computing
applications. Accelerating their training is a major challenge and techniques range from …

Training deep neural networks with 8-bit floating point numbers

N Wang, J Choi, D Brand, CY Chen… - Advances in neural …, 2018 - proceedings.neurips.cc
The state-of-the-art hardware platforms for training deep neural networks are moving from
traditional single precision (32-bit) computations towards 16 bits of precision, in large part …
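The paper trains with even lower precision than 16 bits (an 8-bit floating-point format with 1 sign, 5 exponent and 2 mantissa bits). As a rough illustration of the core rounding operation only, the hypothetical NumPy sketch below snaps values to a reduced-mantissa floating-point grid; it omits the exponent-range handling, chunk-based accumulation and stochastic rounding that the actual method relies on.

import numpy as np

def round_to_low_precision(x, mantissa_bits=2):
    # Round magnitudes to the nearest value representable with `mantissa_bits`
    # fractional mantissa bits; exponent clamping and subnormals are ignored.
    x = np.asarray(x, dtype=np.float64)
    out = np.zeros_like(x)
    nz = x != 0
    mag = np.abs(x[nz])
    e = np.floor(np.log2(mag))               # power-of-two exponent of each value
    spacing = 2.0 ** (e - mantissa_bits)     # gap between adjacent representable magnitudes
    out[nz] = np.sign(x[nz]) * np.round(mag / spacing) * spacing
    return out

# Example: simulate casting a weight tensor before a low-precision multiply.
w = np.random.randn(4, 4)
print(round_to_low_precision(w))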

DeepChain: Auditable and privacy-preserving deep learning with blockchain-based incentive

J Weng, J Weng, J Zhang, M Li… - IEEE Transactions on …, 2019 - ieeexplore.ieee.org
Deep learning can achieve higher accuracy than traditional machine learning algorithms in
a variety of tasks. Recently, privacy-preserving deep learning has drawn …

A review on deep learning models for forecasting time series data of solar irradiance and photovoltaic power

RA Rajagukguk, RAA Ramadhan, HJ Lee - Energies, 2020 - mdpi.com
Presently, deep learning models are an alternative solution for predicting solar energy
because of their accuracy. The present study reviews deep learning models for handling …

Can decentralized algorithms outperform centralized algorithms? A case study for decentralized parallel stochastic gradient descent

X Lian, C Zhang, H Zhang, CJ Hsieh… - Advances in neural …, 2017 - proceedings.neurips.cc
Most distributed machine learning systems nowadays, including TensorFlow and CNTK, are
built in a centralized fashion. One bottleneck of centralized algorithms lies in the high …
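As a rough sketch of the decentralized setting studied here, the toy NumPy example below runs gradient steps on per-worker quadratic losses while each worker averages only with its two ring neighbours through a doubly stochastic mixing matrix; there is no central parameter server. The losses, ring topology and step size are illustrative assumptions, not the paper's experimental setup.

import numpy as np

rng = np.random.default_rng(0)
n_workers, dim, lr, steps = 8, 5, 0.1, 300

# Each worker i holds a local quadratic loss 0.5 * ||x - t_i||^2 (toy data).
targets = rng.normal(size=(n_workers, dim))
x = np.zeros((n_workers, dim))             # one model copy per worker

# Doubly stochastic mixing matrix for a ring: average self with two neighbours.
W = np.zeros((n_workers, n_workers))
for i in range(n_workers):
    for j in (i - 1, i, i + 1):
        W[j % n_workers, i] = 1.0 / 3.0

for _ in range(steps):
    grads = x - targets                    # local gradients (noise-free for brevity)
    x = W @ x - lr * grads                 # gossip with neighbours, then local step

# The worker-averaged model approaches the minimiser of the global objective,
# i.e. the mean of the local targets.
print(np.allclose(x.mean(axis=0), targets.mean(axis=0), atol=1e-6))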

Asynchronous decentralized parallel stochastic gradient descent

X Lian, W Zhang, C Zhang, J Liu - … Conference on Machine …, 2018 - proceedings.mlr.press
Most commonly used distributed machine learning systems are either synchronous or
centralized asynchronous. Synchronous algorithms like AllReduce-SGD perform poorly in a …

Fast algorithms for convolutional neural networks

A Lavin, S Gray - Proceedings of the IEEE conference on …, 2016 - cv-foundation.org
Deep convolutional neural networks take GPU-days of computation to train on large data
sets. Pedestrian detection for self-driving cars requires very low latency. Image recognition …
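The speedups in this line of work come from minimal filtering algorithms. As a small illustration, the sketch below computes the 1D Winograd form F(2,3), producing two outputs of a 3-tap correlation with 4 multiplications instead of 6; the 2D tiling and per-channel batching needed for real convolutional layers are omitted.

import numpy as np

def winograd_f23(d, g):
    # Two outputs of a 3-tap correlation over a 4-element input tile,
    # using 4 elementwise multiplications instead of 6.
    m1 = (d[0] - d[2]) * g[0]
    m2 = (d[1] + d[2]) * (g[0] + g[1] + g[2]) / 2
    m3 = (d[2] - d[1]) * (g[0] - g[1] + g[2]) / 2
    m4 = (d[1] - d[3]) * g[2]
    return np.array([m1 + m2 + m3, m2 - m3 - m4])

d = np.array([1.0, 2.0, 3.0, 4.0])         # input tile
g = np.array([0.5, -1.0, 2.0])             # 3-tap filter
direct = np.array([d[0]*g[0] + d[1]*g[1] + d[2]*g[2],
                   d[1]*g[0] + d[2]*g[1] + d[3]*g[2]])
print(np.allclose(winograd_f23(d, g), direct))   # True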

Cooperative SGD: A unified framework for the design and analysis of local-update SGD algorithms

J Wang, G Joshi - Journal of Machine Learning Research, 2021 - jmlr.org
When training machine learning models using stochastic gradient descent (SGD) with a
large number of nodes or massive edge devices, the communication cost of synchronizing …
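Below is a minimal sketch of the local-update pattern analysed in this framework, under illustrative assumptions (quadratic per-worker losses, a fixed number of local steps per round): each worker runs tau SGD steps on its own data, then all model copies are averaged in a single communication round.

import numpy as np

rng = np.random.default_rng(1)
n_workers, dim, lr, tau, rounds = 4, 5, 0.05, 10, 50   # tau local steps per round

# Toy local losses 0.5 * ||x - t_i||^2; the global optimum is the mean of the t_i.
targets = rng.normal(size=(n_workers, dim))
x = np.zeros((n_workers, dim))

for _ in range(rounds):
    for _ in range(tau):                    # tau local SGD steps, no communication
        noise = 0.01 * rng.normal(size=x.shape)
        x -= lr * ((x - targets) + noise)   # noisy gradient of each local loss
    x[:] = x.mean(axis=0)                   # one synchronization: average all copies

# After training, every copy sits close to the global minimiser.
print(np.linalg.norm(x[0] - targets.mean(axis=0)))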

Adaptive communication strategies to achieve the best error-runtime trade-off in local-update SGD

J Wang, G Joshi - Proceedings of Machine Learning and …, 2019 - proceedings.mlsys.org
Large-scale machine learning training, in particular distributed stochastic gradient descent,
needs to be robust to inherent system variability such as node straggling and random …