CAFE: Towards Compact, Adaptive, and Fast Embedding for Large-scale Recommendation Models

H Zhang, Z Liu, B Chen, Y Zhao, T Zhao… - Proceedings of the …, 2024 - dl.acm.org
Recently, the growing memory demands of embedding tables in Deep Learning
Recommendation Models (DLRMs) pose great challenges for model training and …

Embedding Optimization for Training Large-scale Deep Learning Recommendation Systems with EMBark

S Liu, N Zheng, H Kang, X Simmons, J Zhang… - Proceedings of the 18th …, 2024 - dl.acm.org
Training large-scale deep learning recommendation models (DLRMs) with embedding
tables stretching across multiple GPUs in a cluster presents a unique challenge, demanding …

DLRover-RM: Resource Optimization for Deep Recommendation Models Training in the Cloud

Q Wang, T Lan, Y Tang, B Sang, Z Huang, Y Du… - Proc. VLDB Endow …, 2024 - vldb.org
Deep learning recommendation models (DLRM) rely on large embedding tables to manage
categorical sparse features. Expanding such embedding tables can significantly enhance …

Mixed-Precision Embeddings for Large-Scale Recommendation Models

S Li, Z Hu, F Lyu, X Tang, H Wang, S Xu, W Luo… - arXiv preprint arXiv …, 2024 - arxiv.org
Embedding techniques have become essential components of large databases in the deep
learning era. By encoding discrete entities, such as words, items, or graph nodes, into …

Disaggregating Embedding Recommendation Systems with FlexEMR

Y Huang, Z Yang, J Xing, Y Dai, Y Qiu, D Wu… - arXiv preprint arXiv …, 2024 - arxiv.org
Efficiently serving embedding-based recommendation (EMR) models remains a significant
challenge due to their increasingly large memory requirements. Today's practice splits the …

Fine-Grained Embedding Dimension Optimization During Training for Recommender Systems

Q Luo, P Wang, W Zhang, F Lai, J Mao, X Wei… - arXiv preprint arXiv …, 2024 - arxiv.org
Huge embedding tables in modern Deep Learning Recommender Models (DLRM) require
prohibitively large memory during training and inference. Aiming to reduce the memory …

Towards a Flexible and High-Fidelity Approach to Distributed DNN Training Emulation

B Liu, MA Ojewale, Y Ding, M Canini - … of the 15th ACM SIGOPS Asia …, 2024 - dl.acm.org
We propose NeuronaBox, a flexible, user-friendly, and high-fidelity approach to emulate
DNN training workloads. We argue that to accurately observe performance, it is possible to …

Pushing the Performance Envelope of DNN-based Recommendation Systems Inference on GPUs

R Jain, VM Bhasi, A Jog, A Sivasubramaniam… - arXiv preprint arXiv …, 2024 - arxiv.org
Personalized recommendation is a ubiquitous application on the internet, with many
industries and hyperscalers extensively leveraging Deep Learning Recommendation …

ERCache: An Efficient and Reliable Caching Framework for Large-Scale User Representations in Meta's Ads System

F Zhou, Y Huang, D Liang, D Li, Z Zhang… - arXiv preprint arXiv …, 2024 - arxiv.org
The increasing complexity of deep learning models used for calculating user
representations presents significant challenges, particularly with limited computational …

Proof-of-Concept of a Flexible and High-Fidelity Approach to Distributed DNN Training Emulation

B Liu, MA Ojewale, Y Ding, M Canini - Proceedings of the 2024 …, 2024 - dl.acm.org
We propose NeuronaBox, a flexible, user-friendly, and high-fidelity approach to emulate
DNN training workloads. We argue that to accurately observe performance, it is possible to …