DeepRecSys: A system for optimizing end-to-end at-scale neural recommendation inference

U Gupta, S Hsia, V Saraph, X Wang… - 2020 ACM/IEEE 47th …, 2020 - ieeexplore.ieee.org
Neural personalized recommendation is the cornerstone of a wide collection of cloud
services and products, constituting significant compute demand of cloud infrastructure. Thus …

Understanding capacity-driven scale-out neural recommendation inference

M Lui, Y Yetim, Ö Özkan, Z Zhao… - … Analysis of Systems …, 2021 - ieeexplore.ieee.org
Deep learning recommendation models have grown to the terabyte scale. Traditional
serving schemes that load entire models to a single server are unable to support this scale …

RecNMP: Accelerating personalized recommendation with near-memory processing

L Ke, U Gupta, BY Cho, D Brooks… - 2020 ACM/IEEE 47th …, 2020 - ieeexplore.ieee.org
Personalized recommendation systems leverage deep learning models and account for the
majority of data center AI cycles. Their performance is dominated by memory-bound sparse …

RecPipe: Co-designing models and hardware to jointly optimize recommendation quality and performance

U Gupta, S Hsia, J Zhang, M Wilkening… - MICRO-54: 54th Annual …, 2021 - dl.acm.org
Deep learning recommendation systems must provide high quality, personalized content
under strict tail-latency targets and high system loads. This paper presents RecPipe, a …

Tensor casting: Co-designing algorithm-architecture for personalized recommendation training

Y Kwon, Y Lee, M Rhu - 2021 IEEE International Symposium …, 2021 - ieeexplore.ieee.org
Personalized recommendations are one of the most widely deployed machine learning (ML)
workloads serviced from cloud datacenters. As such, architectural solutions for high …

Understanding data storage and ingestion for large-scale deep recommendation model training: Industrial product

M Zhao, N Agarwal, A Basant, B Gedik, S Pan… - Proceedings of the 49th …, 2022 - dl.acm.org
Datacenter-scale AI training clusters consisting of thousands of domain-specific accelerators
(DSA) are used to train increasingly complex deep learning models. These clusters rely on a …

The architectural implications of Facebook's DNN-based personalized recommendation

U Gupta, CJ Wu, X Wang, M Naumov… - … Symposium on High …, 2020 - ieeexplore.ieee.org
The widespread application of deep learning has changed the landscape of computation in
data centers. In particular, personalized recommendation for content ranking is now largely …

Deep learning recommendation model for personalization and recommendation systems

M Naumov, D Mudigere, HJM Shi, J Huang… - arXiv preprint arXiv …, 2019 - arxiv.org
With the advent of deep learning, neural network-based recommendation models have
emerged as an important tool for tackling personalization and recommendation tasks. These …

Monolith: real time recommendation system with collisionless embedding table

Z Liu, L Zou, X Zou, C Wang, B Zhang, D Tang… - arXiv preprint arXiv …, 2022 - arxiv.org
Building a scalable and real-time recommendation system is vital for many businesses
driven by time-sensitive customer feedback, such as short-videos ranking or online ads …

Centaur: A chiplet-based, hybrid sparse-dense accelerator for personalized recommendations

R Hwang, T Kim, Y Kwon, M Rhu - 2020 ACM/IEEE 47th …, 2020 - ieeexplore.ieee.org
Personalized recommendations are the backbone machine learning (ML) algorithm that
powers several important application domains (e.g., ads, e-commerce, etc.) serviced from …