Machine learning at Facebook: Understanding inference at the edge

CJ Wu, D Brooks, K Chen, D Chen… - … symposium on high …, 2019 - ieeexplore.ieee.org
At Facebook, machine learning provides a wide range of capabilities that drive many
aspects of user experience including ranking posts, content understanding, object detection …

Applied machine learning at Facebook: A datacenter infrastructure perspective

K Hazelwood, S Bird, D Brooks… - … symposium on high …, 2018 - ieeexplore.ieee.org
Machine learning sits at the core of many essential products and services at Facebook. This
paper describes the hardware and software infrastructure that supports machine learning at …

Deep learning training in Facebook data centers: Design of scale-up and scale-out systems

M Naumov, J Kim, D Mudigere, S Sridharan… - arXiv preprint arXiv …, 2020 - arxiv.org
Large-scale training is important to ensure high performance and accuracy of machine-
learning models. At Facebook we use many different models, including computer vision …

Deep learning inference in Facebook data centers: Characterization, performance optimizations and hardware implications

J Park, M Naumov, P Basu, S Deng, A Kalaiah… - arXiv preprint arXiv …, 2018 - arxiv.org
The application of deep learning techniques resulted in remarkable improvement of
machine learning models. This paper provides detailed characterizations of deep learning …

The architectural implications of Facebook's DNN-based personalized recommendation

U Gupta, CJ Wu, X Wang, M Naumov… - … Symposium on High …, 2020 - ieeexplore.ieee.org
The widespread application of deep learning has changed the landscape of computation in
data centers. In particular, personalized recommendation for content ranking is now largely …

Cloud-based or on-device: An empirical study of mobile deep inference

T Guo - 2018 IEEE International Conference on Cloud …, 2018 - ieeexplore.ieee.org
Modern mobile applications are benefiting significantly from the advancement in deep
learning, e.g., implementing real-time image recognition and conversational systems. Given a …

Demystifying TensorRT: Characterizing neural network inference engine on NVIDIA edge devices

O Shafi, C Rai, R Sen… - 2021 IEEE International …, 2021 - ieeexplore.ieee.org
Edge devices are seeing tremendous growth in sensing and computational capabilities.
Running state-of-the-art deep neural network (NN) based data processing on multi-core …

Google neural network models for edge devices: Analyzing and mitigating machine learning inference bottlenecks

A Boroumand, S Ghose, B Akin… - 2021 30th …, 2021 - ieeexplore.ieee.org
Emerging edge computing platforms often contain machine learning (ML) accelerators that
can accelerate inference for a wide range of neural network (NN) models. These models are …

Deep learning with edge computing: A review

J Chen, X Ran - Proceedings of the IEEE, 2019 - ieeexplore.ieee.org
Deep learning is currently widely used in a variety of applications, including computer vision
and natural language processing. End devices, such as smartphones and Internet-of-Things …

RecNMP: Accelerating personalized recommendation with near-memory processing

L Ke, U Gupta, BY Cho, D Brooks… - 2020 ACM/IEEE 47th …, 2020 - ieeexplore.ieee.org
Personalized recommendation systems leverage deep learning models and account for the
majority of data center AI cycles. Their performance is dominated by memory-bound sparse …