The deep learning compiler: A comprehensive survey

M Li, Y Liu, X Liu, Q Sun, X You, H Yang… - … on Parallel and …, 2020 - ieeexplore.ieee.org
The difficulty of deploying various deep learning (DL) models on diverse DL hardware has
boosted the research and development of DL compilers in the community. Several DL …

nn-Meter: Towards accurate latency prediction of deep-learning model inference on diverse edge devices

LL Zhang, S Han, J Wei, N Zheng, T Cao… - Proceedings of the 19th …, 2021 - dl.acm.org
With the recent trend of on-device deep learning, inference latency has become a crucial
metric in running Deep Neural Network (DNN) models on various mobile and edge devices …

WACO: learning workload-aware co-optimization of the format and schedule of a sparse tensor program

J Won, C Mendis, JS Emer… - Proceedings of the 28th …, 2023 - dl.acm.org
In this paper, we present WACO, a novel method of co-optimizing the format and the
schedule of a given sparsity pattern in a sparse tensor program. A core challenge in this …

NAR-Former: Neural architecture representation learning towards holistic attributes prediction

Y Yi, H Zhang, W Hu, N Wang… - Proceedings of the …, 2023 - openaccess.thecvf.com
With the wide and deep adoption of deep learning models in real applications, there is an
increasing need to model and learn the representations of the neural networks themselves …

ArchGym: An open-source gymnasium for machine learning assisted architecture design

S Krishnan, A Yazdanbakhsh, S Prakash… - Proceedings of the 50th …, 2023 - dl.acm.org
Machine learning (ML) has become a prevalent approach to tame the complexity of design
space exploration for domain-specific architectures. While appealing, using ML for design …

NNLQP: A multi-platform neural network latency query and prediction system with an evolving database

L Liu, M Shen, R Gong, F Yu, H Yang - Proceedings of the 51st …, 2022 - dl.acm.org
Deep neural networks (DNNs) are widely used in various applications. Accurate
latency feedback is essential for model design and deployment. In this work, we attempt to …

CDMPP: A Device-Model Agnostic Framework for Latency Prediction of Tensor Programs

H Hu, J Su, J Zhao, Y Peng, Y Zhu, H Lin… - Proceedings of the …, 2024 - dl.acm.org
Deep Neural Networks (DNNs) have shown excellent performance in a wide range of
machine learning applications. Knowing the latency of running a DNN model or tensor …

All-Sky Autonomous Computing in UAV Swarm

H Sun, Y Qu, C Dong, H Dai, Z Li… - IEEE Transactions …, 2024 - ieeexplore.ieee.org
Unmanned aerial vehicles (UAVs) play an essential role in emergency cases and adverse
environments for applications like disaster detection and mine exploration. To process the …

A uniform latency model for DNN accelerators with diverse architectures and dataflows

L Mei, H Liu, T Wu, HE Sumbul… - … , Automation & Test …, 2022 - ieeexplore.ieee.org
In the early design phase of a Deep Neural Network (DNN) acceleration system, fast energy
and latency estimation are important to evaluate the optimality of different design candidates …

Precious: Resource-demand estimation for embedded neural network accelerators

S Reif, B Herzog, J Hemp, T Hönig… - … Learning Workloads on …, 2020 - cs.fau.de
The recent advances of hardware-based accelerators for machine learning—in particular
neural networks—attracted the attention of embedded-systems designers and engineers …