Fleche: an efficient GPU embedding cache for personalized recommendations

M Xie, Y Lu, J Lin, Q Wang, J Gao, K Ren… - Proceedings of the …, 2022 - dl.acm.org
Deep learning based models have dominated current production recommendation systems.
However, the gap between CPU-side DRAM data accessing and GPU processing still …

Training personalized recommendation systems from (GPU) scratch: Look forward not backwards

Y Kwon, M Rhu - Proceedings of the 49th Annual International …, 2022 - dl.acm.org
Personalized recommendation models (RecSys) are one of the most popular machine
learning workloads serviced by hyperscalers. A critical challenge of training RecSys is its …

PICASSO: Unleashing the Potential of GPU-centric Training for Wide-and-deep Recommender Systems

Y Zhang, L Chen, S Yang, M Yuan, H Yi… - 2022 IEEE 38th …, 2022 - ieeexplore.ieee.org
The development of personalized recommendation has significantly improved the accuracy
of information matching and the revenue of e-commerce platforms. Recently, it has two …

cDLRM: Look ahead caching for scalable training of recommendation models

K Balasubramanian, A Alshabanah, JD Choe… - Proceedings of the 15th …, 2021 - dl.acm.org
Deep learning recommendation models (DLRMs) are typically composed of two sets of
parameters: large embedding tables to handle sparse categorical inputs, and neural …

Accelerating recommendation system training by leveraging popular choices

M Adnan, YE Maboud, D Mahajan, PJ Nair - arXiv preprint arXiv …, 2021 - arxiv.org
Recommender models are commonly used to suggest relevant items to a user for
e-commerce and online advertisement-based applications. These models use massive …

Heterogeneous acceleration pipeline for recommendation system training

M Adnan, YE Maboud, D Mahajan… - 2024 ACM/IEEE 51st …, 2024 - ieeexplore.ieee.org
Recommendation models rely on deep learning networks and large embedding tables,
resulting in computationally and memory-intensive processes. These models are typically …

Centaur: A chiplet-based, hybrid sparse-dense accelerator for personalized recommendations

R Hwang, T Kim, Y Kwon, M Rhu - 2020 ACM/IEEE 47th …, 2020 - ieeexplore.ieee.org
Personalized recommendations are the backbone machine learning (ML) algorithm that
powers several important application domains (e.g., ads, e-commerce, etc.) serviced from …

Optimizing CPU performance for recommendation systems at-scale

R Jain, S Cheng, V Kalagi, V Sanghavi, S Kaul… - Proceedings of the 50th …, 2023 - dl.acm.org
Deep Learning Recommendation Models (DLRMs) are very popular in personalized
recommendation systems and are a major contributor to the data-center AI cycles. Due to the …

Merlin HugeCTR: GPU-accelerated recommender system training and inference

Z Wang, Y Wei, M Lee, M Langer, F Yu, J Liu… - Proceedings of the 16th …, 2022 - dl.acm.org
In this talk, we introduce Merlin HugeCTR. Merlin HugeCTR is an open source,
GPU-accelerated integration framework for click-through rate estimation. It optimizes both training …

GPU accelerated feature engineering and training for recommender systems

B Schifferer, G Titericz, C Deotte, C Henkel… - Proceedings of the …, 2020 - dl.acm.org
In this paper we present our 1st place solution of the RecSys Challenge 2020 which focused
on the prediction of user behavior, specifically the interaction with content, on this year's …