A survey of deep learning on CPUs: opportunities and co-optimizations

S Mittal, P Rajput, S Subramoney - IEEE Transactions on …, 2021 - ieeexplore.ieee.org
CPU is a powerful, pervasive, and indispensable platform for running deep learning (DL)
workloads in systems ranging from mobile to extreme-end servers. In this article, we present …

Characterizing deep learning training workloads on Alibaba-PAI

M Wang, C Meng, G Long, C Wu… - 2019 IEEE …, 2019 - ieeexplore.ieee.org
Modern deep learning models have been exploited in various domains, including computer
vision (CV), natural language processing (NLP), search and recommendation. In practical AI …

AIBench: an industry standard internet service AI benchmark suite

W Gao, F Tang, L Wang, J Zhan, C Lan, C Luo… - arXiv preprint arXiv …, 2019 - arxiv.org
Today's Internet Services are undergoing fundamental changes and shifting to an intelligent
computing era where AI is widely employed to augment services. In this context, many …

Enabling serverless deployment of large-scale AI workloads

A Christidis, S Moschoyiannis, CH Hsu… - IEEE Access, 2020 - ieeexplore.ieee.org
We propose a set of optimization techniques for transforming a generic AI codebase so that
it can be successfully deployed to a restricted serverless environment, without compromising …

A BenchCouncil view on benchmarking emerging and future computing

J Zhan - BenchCouncil Transactions on Benchmarks, Standards …, 2022 - Elsevier
The measurable properties of the artifacts or objects in the computer, management, or
finance disciplines are extrinsic, not inherent—dependent on their problem definitions and …

AIBench: towards scalable and comprehensive datacenter AI benchmarking

W Gao, C Luo, L Wang, X Xiong, J Chen, T Hao… - … , and Optimizing: First …, 2019 - Springer
AI benchmarking provides yardsticks for benchmarking, measuring and evaluating
innovative AI algorithms, architecture, and systems. Coordinated by BenchCouncil, this …

BigDataBench: A scalable and unified big data and AI benchmark suite

W Gao, J Zhan, L Wang, C Luo, D Zheng… - arXiv preprint arXiv …, 2018 - arxiv.org
Several fundamental changes in technology indicate domain-specific hardware and
software co-design is the only path left. In this context, architecture, system, data …

Serving machine learning workloads in resource constrained environments: A serverless deployment example

A Christidis, R Davies… - 2019 IEEE 12th …, 2019 - ieeexplore.ieee.org
Deployed AI platforms typically ship with bulky system architectures which present
bottlenecks and a high risk of failure. A serverless deployment can mitigate these factors and …

Profiling gem5 Simulator

J Umeike, N Patel, A Manley… - … Analysis of Systems …, 2023 - ieeexplore.ieee.org
In this work, we set out to find the answers to the following questions: (1) Where are the
bottlenecks in a state-of-the-art architectural simulator? (2) How much faster can architectural …

AIBench Scenario: Scenario-distilling AI benchmarking

W Gao, F Tang, J Zhan, X Wen, L Wang… - 2021 30th …, 2021 - ieeexplore.ieee.org
Modern real-world application scenarios like Internet services consist of a diversity of AI and
non-AI modules with huge code sizes and long and complicated execution paths, which …