Scalable deep learning on distributed infrastructures: Challenges, techniques, and tools

R Mayer, HA Jacobsen - ACM Computing Surveys (CSUR), 2020 - dl.acm.org
Deep Learning (DL) has had immense success in the recent past, leading to state-of-the-art
results in various domains, such as image recognition and natural language processing …

Orchestrating the development lifecycle of machine learning-based IoT applications: A taxonomy and survey

B Qian, J Su, Z Wen, DN Jha, Y Li, Y Guan… - ACM Computing …, 2020 - dl.acm.org
Machine Learning (ML) and Internet of Things (IoT) are complementary advances: ML
techniques unlock the potential of IoT with intelligence, and IoT applications increasingly …

Privacy preserving machine learning with homomorphic encryption and federated learning

H Fang, Q Qian - Future Internet, 2021 - mdpi.com
Privacy protection has been an important concern with the great success of machine
learning. This paper proposes a multi-party privacy-preserving machine learning …

Optimus: an efficient dynamic resource scheduler for deep learning clusters

Y Peng, Y Bao, Y Chen, C Wu, C Guo - Proceedings of the Thirteenth …, 2018 - dl.acm.org
Deep learning workloads are common in today's production clusters due to the proliferation
of deep learning driven AI services (e.g., speech recognition, machine translation). A deep …

Gaia: Geo-distributed machine learning approaching LAN speeds

K Hsieh, A Harlap, N Vijaykumar, D Konomis… - … USENIX Symposium on …, 2017 - usenix.org
Machine learning (ML) is widely used to derive useful information from large-scale data
(such as user activities, pictures, and videos) generated at increasingly rapid rates, all over …

PipeDream: Fast and efficient pipeline parallel DNN training

A Harlap, D Narayanan, A Phanishayee… - arXiv preprint arXiv …, 2018 - arxiv.org
PipeDream is a Deep Neural Network (DNN) training system for GPUs that parallelizes
computation by pipelining execution across multiple machines. Its pipeline parallel …
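
As a concrete illustration of the pipelining idea sketched in this entry, the following minimal Python simulation shows how micro-batches flow through model stages placed on different workers, so that the stages overlap work once the pipeline fills. It is a hypothetical sketch (stage and micro-batch counts are invented) and does not reproduce PipeDream's actual 1F1B schedule or weight stashing.

# Toy simulation of pipeline-parallel scheduling (illustrative only, not
# PipeDream's scheduler). Stage and micro-batch counts are made up.

NUM_STAGES = 4        # model split across 4 workers (hypothetical)
NUM_MICROBATCHES = 8  # one minibatch split into 8 micro-batches (hypothetical)

def pipeline_schedule(num_stages, num_microbatches):
    """Return, per time step, which (stage, micro-batch) forward passes run.

    In the simplest pipelined schedule, stage s starts micro-batch m at
    time step s + m, so different stages work on different micro-batches
    at the same time instead of waiting for one another.
    """
    total_steps = num_stages + num_microbatches - 1
    schedule = []
    for t in range(total_steps):
        active = [(s, t - s) for s in range(num_stages)
                  if 0 <= t - s < num_microbatches]
        schedule.append(active)
    return schedule

if __name__ == "__main__":
    for t, active in enumerate(pipeline_schedule(NUM_STAGES, NUM_MICROBATCHES)):
        print(f"t={t:2d}  " + ", ".join(f"stage{s}:mb{m}" for s, m in active))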

HetPipe: Enabling large DNN training on (whimpy) heterogeneous GPU clusters through integration of pipelined model parallelism and data parallelism

JH Park, G Yun, MY Chang, NT Nguyen, S Lee… - 2020 USENIX Annual …, 2020 - usenix.org
Deep Neural Network (DNN) models have continuously been growing in size in order to
improve the accuracy and quality of the models. Moreover, for training of large DNN models …

PPDsparse: A parallel primal-dual sparse method for extreme classification

IEH Yen, X Huang, W Dai, P Ravikumar… - Proceedings of the 23rd …, 2017 - dl.acm.org
Extreme Classification comprises multi-class or multi-label prediction where there is a large
number of classes, and is increasingly relevant to many real-world applications such as text …

HET: scaling out huge embedding model training via cache-enabled distributed framework

X Miao, H Zhang, Y Shi, X Nie, Z Yang, Y Tao… - arXiv preprint arXiv …, 2021 - arxiv.org
Embedding models have been an effective learning paradigm for high-dimensional data.
However, one open issue of embedding models is that their representations (latent factors) …

Supporting very large models using automatic dataflow graph partitioning

M Wang, C Huang, J Li - … of the Fourteenth EuroSys Conference 2019, 2019 - dl.acm.org
This paper presents Tofu, a system that partitions very large DNN models across multiple
GPU devices to reduce per-GPU memory footprint. Tofu is designed to partition a dataflow …
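
To make the partitioning idea concrete, here is a minimal NumPy sketch that splits one large matrix-multiply operator column-wise across several hypothetical devices, so that no single device needs the full weight matrix in memory. It only illustrates the general notion of operator partitioning; it is not Tofu's dataflow-graph partitioning algorithm, and the device count and tensor sizes are invented.

# Illustrative sketch only: splitting one large matmul operator across
# several "devices" to shrink the per-device memory footprint. Not Tofu's
# partitioning algorithm; NUM_DEVICES and tensor shapes are hypothetical.
import numpy as np

NUM_DEVICES = 4  # hypothetical GPU count

def column_partitioned_matmul(x, w, num_devices):
    """Compute x @ w with w split column-wise across num_devices devices.

    Each device holds only a shard of shape (in_dim, out_dim / num_devices),
    so the full weight matrix never has to reside on a single device.
    """
    shards = np.array_split(w, num_devices, axis=1)  # one weight shard per device
    partials = [x @ shard for shard in shards]       # each product runs device-locally
    return np.concatenate(partials, axis=1)          # gather the per-device outputs

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x = rng.standard_normal((32, 1024))
    w = rng.standard_normal((1024, 4096))
    out = column_partitioned_matmul(x, w, NUM_DEVICES)
    assert np.allclose(out, x @ w)  # partitioned result matches the full matmul
    print(out.shape)  # (32, 4096)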