Sustainable AI: Environmental implications, challenges and opportunities

CJ Wu, R Raghavendra, U Gupta… - Proceedings of …, 2022 - proceedings.mlsys.org
This paper explores the environmental impact of the super-linear growth trends for AI from a
holistic perspective, spanning Data, Algorithms, and System Hardware. We characterize the …

Communication-efficient distributed deep learning: A comprehensive survey

Z Tang, S Shi, W Wang, B Li, X Chu - arXiv preprint arXiv:2003.06307, 2020 - arxiv.org
Distributed deep learning (DL) has become prevalent in recent years to reduce training time
by leveraging multiple computing devices (e.g., GPUs/TPUs) due to larger models and …

LIRA: Learnable, imperceptible and robust backdoor attacks

K Doan, Y Lao, W Zhao, P Li - Proceedings of the IEEE/CVF …, 2021 - openaccess.thecvf.com
Recently, machine learning models have been shown to be vulnerable to backdoor
attacks, primarily due to the lack of transparency in black-box models such as deep neural …

ZeRO-Infinity: Breaking the GPU memory wall for extreme scale deep learning

S Rajbhandari, O Ruwase, J Rasley, S Smith… - Proceedings of the …, 2021 - dl.acm.org
In the last three years, the largest dense deep learning models have grown over 1000x to
reach hundreds of billions of parameters, while the GPU memory has only grown by 5x (16 …

RecSSD: near data processing for solid state drive based recommendation inference

M Wilkening, U Gupta, S Hsia, C Trippel… - Proceedings of the 26th …, 2021 - dl.acm.org
Neural personalized recommendation models are used across a wide variety of datacenter
applications including search, social media, and entertainment. State-of-the-art models …

Understanding training efficiency of deep learning recommendation models at scale

B Acun, M Murphy, X Wang, J Nie… - … Symposium on High …, 2021 - ieeexplore.ieee.org
The use of GPUs has proliferated for machine learning workflows and is now considered
mainstream for many deep learning models. Meanwhile, when training state-of-the-art …

RecShard: statistical feature-based memory optimization for industry-scale neural recommendation

G Sethi, B Acun, N Agarwal, C Kozyrakis… - Proceedings of the 27th …, 2022 - dl.acm.org
We propose RecShard, a fine-grained embedding table (EMB) partitioning and placement
technique for deep learning recommendation models (DLRMs). RecShard is designed …

A comprehensive survey on trustworthy recommender systems

W Fan, X Zhao, X Chen, J Su, J Gao, L Wang… - arXiv preprint arXiv …, 2022 - arxiv.org
As one of the most successful AI-powered applications, recommender systems aim to help
people make appropriate decisions in an effective and efficient way, by providing …

DreamShard: Generalizable embedding table placement for recommender systems

D Zha, L Feng, Q Tan, Z Liu, KH Lai… - Advances in …, 2022 - proceedings.neurips.cc
We study embedding table placement for distributed recommender systems, which aims to
partition and place the tables on multiple hardware devices (e.g., GPUs) to balance the …

HET: scaling out huge embedding model training via cache-enabled distributed framework

X Miao, H Zhang, Y Shi, X Nie, Z Yang, Y Tao… - arXiv preprint arXiv …, 2021 - arxiv.org
Embedding models have been an effective learning paradigm for high-dimensional data.
However, one open issue of embedding models is that their representations (latent factors) …