Applied machine learning at Facebook: A datacenter infrastructure perspective

K Hazelwood, S Bird, D Brooks… - … symposium on high …, 2018 - ieeexplore.ieee.org
Machine learning sits at the core of many essential products and services at Facebook. This
paper describes the hardware and software infrastructure that supports machine learning at …

Machine learning at Facebook: Understanding inference at the edge

CJ Wu, D Brooks, K Chen, D Chen… - … symposium on high …, 2019 - ieeexplore.ieee.org
At Facebook, machine learning provides a wide range of capabilities that drive many
aspects of user experience including ranking posts, content understanding, object detection …

Deep learning inference in Facebook data centers: Characterization, performance optimizations and hardware implications

J Park, M Naumov, P Basu, S Deng, A Kalaiah… - arXiv preprint arXiv …, 2018 - arxiv.org
The application of deep learning techniques resulted in remarkable improvement of
machine learning models. This paper provides detailed characterizations of deep learning …

The architectural implications of Facebook's DNN-based personalized recommendation

U Gupta, CJ Wu, X Wang, M Naumov… - … Symposium on High …, 2020 - ieeexplore.ieee.org
The widespread application of deep learning has changed the landscape of computation in
data centers. In particular, personalized recommendation for content ranking is now largely …

Deep learning training in Facebook data centers: Design of scale-up and scale-out systems

M Naumov, J Kim, D Mudigere, S Sridharan… - arXiv preprint arXiv …, 2020 - arxiv.org
Large-scale training is important to ensure high performance and accuracy of machine-
learning models. At Facebook we use many different models, including computer vision …

FireSim: FPGA-accelerated cycle-exact scale-out system simulation in the public cloud

S Karandikar, H Mao, D Kim, D Biancolin… - 2018 ACM/IEEE 45th …, 2018 - ieeexplore.ieee.org
We present FireSim, an open-source simulation platform that enables cycle-exact
microarchitectural simulation of large scale-out clusters by combining FPGA-accelerated …

Exploring serverless computing for neural network training

L Feng, P Kudva, D Da Silva… - 2018 IEEE 11th …, 2018 - ieeexplore.ieee.org
Serverless or functions as a service runtimes have shown significant benefits to efficiency
and cost for event-driven cloud applications. Although serverless runtimes are limited to …

Litz: Elastic framework for High-Performance distributed machine learning

A Qiao, A Aghayev, W Yu, H Chen, Q Ho… - 2018 USENIX Annual …, 2018 - usenix.org
Machine Learning (ML) is an increasingly popular application in the cloud and data-center,
inspiring new algorithmic and systems techniques that leverage unique properties of ML …

RecNMP: Accelerating personalized recommendation with near-memory processing

L Ke, U Gupta, BY Cho, D Brooks… - 2020 ACM/IEEE 47th …, 2020 - ieeexplore.ieee.org
Personalized recommendation systems leverage deep learning models and account for the
majority of data center AI cycles. Their performance is dominated by memory-bound sparse …

Interactive supercomputing on 40,000 cores for machine learning and data analysis

A Reuther, J Kepner, C Byun, S Samsi… - 2018 IEEE High …, 2018 - ieeexplore.ieee.org
Interactive massively parallel computations are critical for machine learning and data
analysis. These computations are a staple of the MIT Lincoln Laboratory Supercomputing …