Database meets deep learning: Challenges and opportunities

W Wang, M Zhang, G Chen, HV Jagadish, BC Ooi… - ACM SIGMOD …, 2016 - dl.acm.org
Deep learning has recently become very popular on account of its incredible success in
many complex data-driven applications, including image classification and speech …

Enabling resource-efficient aiot system with cross-level optimization: A survey

S Liu, B Guo, C Fang, Z Wang, S Luo… - … Surveys & Tutorials, 2023 - ieeexplore.ieee.org
The emerging field of artificial intelligence of things (AIoT, AI + IoT) is driven by the
widespread use of intelligent infrastructures and the impressive success of deep learning …

On the opportunities and risks of foundation models

R Bommasani, DA Hudson, E Adeli, R Altman… - arXiv preprint arXiv …, 2021 - arxiv.org
AI is undergoing a paradigm shift with the rise of models (e.g., BERT, DALL-E, GPT-3) that are
trained on broad data at scale and are adaptable to a wide range of downstream tasks. We …

TASO: optimizing deep learning computation with automatic generation of graph substitutions

Z Jia, O Padon, J Thomas, T Warszawski… - Proceedings of the 27th …, 2019 - dl.acm.org
Existing deep neural network (DNN) frameworks optimize the computation graph of a DNN
by applying graph transformations manually designed by human experts. This approach …

Data movement is all you need: A case study on optimizing transformers

A Ivanov, N Dryden, T Ben-Nun, S Li… - … of Machine Learning …, 2021 - proceedings.mlsys.org
Transformers are one of the most important machine learning workloads today. Training one
is a very compute-intensive task, often taking days or weeks, and significant attention has …

Optimizing CNN model inference on CPUs

Y Liu, Y Wang, R Yu, M Li, V Sharma… - 2019 USENIX Annual …, 2019 - usenix.org
The popularity of Convolutional Neural Network (CNN) models and the ubiquity of CPUs
imply that better performance of CNN model inference on CPUs can deliver significant gain …

Apollo: Automatic partition-based operator fusion through layer by layer optimization

J Zhao, X Gao, R Xia, Z Zhang… - Proceedings of …, 2022 - proceedings.mlsys.org
We study fusion for deep neural networks (DNNs) in a just-in-time (JIT) compilation
framework, Apollo. It considers both memory- and compute-bound tensor operators for fusion …

Understanding and bridging the gaps in current GNN performance optimizations

K Huang, J Zhai, Z Zheng, Y Yi, X Shen - Proceedings of the 26th ACM …, 2021 - dl.acm.org
Graph Neural Networks (GNNs) have recently drawn a rapid increase of interest in many
domains for their effectiveness in learning over graphs. Maximizing their performance is …

PET: Optimizing tensor programs with partially equivalent transformations and automated corrections

H Wang, J Zhai, M Gao, Z Ma, S Tang, L Zheng… - … USENIX Symposium on …, 2021 - usenix.org
High-performance tensor programs are critical for efficiently deploying deep neural network
(DNN) models in real-world tasks. Existing frameworks optimize tensor programs by …

Quartz: superoptimization of quantum circuits

M Xu, Z Li, O Padon, S Lin, J Pointing, A Hirth… - Proceedings of the 43rd …, 2022 - dl.acm.org
Existing quantum compilers optimize quantum circuits by applying circuit transformations
designed by experts. This approach requires significant manual effort to design and …