Optimizing deep learning recommender systems training on CPU cluster architectures

D Mudigere, Y Hao, J Huang, Z Jia, A Tulloch… - Proceedings of the 49th …, 2022 - dl.acm.org

Deep learning recommendation models (DLRMs) have been used across many
business-critical services at Meta and are the single largest AI application in terms of infrastructure …

Cited by 93 - Related articles - All 7 versions

TRiM: Enhancing processor-memory interfaces with scalable tensor reduction in memory

J Park, B Kim, S Yun, E Lee, M Rhu… - MICRO-54: 54th Annual …, 2021 - dl.acm.org

Personalized recommendation systems are gaining significant traction due to their industrial
importance. An important building block of recommendation systems consists of the …

Cited by 53 - Related articles - All 4 versions

Stratix 10 NX architecture and applications

M Langhammer, E Nurvitadhi, B Pasca… - The 2021 ACM/SIGDA …, 2021 - dl.acm.org

The advent of AI has driven the adoption of high density low precision arithmetic on FPGAs.
This has resulted in new methods in mapping both arithmetic functions as well as dataflows …

Cited by 45 - Related articles

Cross-stack workload characterization of deep recommendation systems

S Hsia, U Gupta, M Wilkening, CJ Wu… - 2020 IEEE …, 2020 - ieeexplore.ieee.org

Deep learning based recommendation systems form the backbone of most personalized
cloud services. Though the computer architecture community has recently started to take …

Cited by 33 - Related articles - All 8 versions

Kairos: Building cost-efficient machine learning inference systems with heterogeneous cloud resources

B Li, S Samsi, V Gadepally, D Tiwari - Proceedings of the 32nd …, 2023 - dl.acm.org

Online inference is becoming a key service product for many businesses, deployed in cloud
platforms to meet customer demands. Despite their revenue-generation capability, these …

Cited by 9 - Related articles - All 4 versions

Heterogeneous acceleration pipeline for recommendation system training

M Adnan, YE Maboud, D Mahajan… - 2024 ACM/IEEE 51st …, 2024 - ieeexplore.ieee.org

Recommendation models rely on deep learning networks and large embedding tables,
resulting in computationally and memory-intensive processes. These models are typically …

Cited by 14 - Related articles - All 4 versions

Ribbon: cost-effective and QoS-aware deep learning model inference using a diverse pool of cloud computing instances

B Li, RB Roy, T Patel, V Gadepally, K Gettings… - Proceedings of the …, 2021 - dl.acm.org

Deep learning model inference is a key service in many businesses and scientific discovery
processes. This paper introduces Ribbon, a novel deep learning inference serving system …

Cited by 15 - Related articles - All 7 versions

Tensor processing primitives: A programming abstraction for efficiency and portability in deep learning workloads

E Georganas, D Kalamkar, S Avancha… - Proceedings of the …, 2021 - dl.acm.org

During the past decade, novel Deep Learning (DL) algorithms/workloads and hardware
have been developed to tackle a wide range of problems. Despite the advances in …

Cited by 20 - Related articles - All 7 versions

Accelerating Personalized Recommendation with Cross-level Near-Memory Processing

H Liu, L Zheng, Y Huang, C Liu, X Ye, J Yuan… - Proceedings of the 50th …, 2023 - dl.acm.org

The memory-intensive embedding layers of the personalized recommendation systems are
the performance bottleneck as they demand large memory bandwidth and exhibit irregular …

Cited by 5 - Related articles

The trade-offs of model size in large recommendation models: 100GB to 10MB Criteo-TB DLRM model

A Desai, A Shrivastava - Advances in Neural Information …, 2022 - proceedings.neurips.cc

Embedding tables dominate industrial-scale recommendation model sizes, using up to
terabytes of memory. A popular and the largest publicly available machine learning MLPerf …

Cited by 6 - Related articles - All 3 versions
