Adaptive inference through early-exit networks: Design, challenges and directions

S Laskaridis, A Kouris, ND Lane - … of the 5th International Workshop on …, 2021 - dl.acm.org
DNNs are becoming less and less over-parametrised due to recent advances in efficient
model design, through careful hand-crafted or NAS-based methods. Relying on the fact that …

Adaptive neural networks for efficient inference

T Bolukbasi, J Wang, O Dekel… - … on Machine Learning, 2017 - proceedings.mlr.press
We present an approach to adaptively utilize deep neural networks in order to reduce the
evaluation time on new examples without loss of accuracy. Rather than attempting to …
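The cascading idea this entry points at — a cheap model answers easy inputs, and only uncertain inputs fall through to a more expensive model — can be sketched as follows. This is a generic illustration, not the paper's implementation; all names and the toy "models" are invented:

```python
def cascade_predict(x, cheap_model, expensive_model, threshold=0.9):
    """Route x through a two-stage cascade.

    Returns (predicted_label, used_expensive): the cheap model's answer
    is kept whenever its top confidence clears the threshold.
    """
    probs = cheap_model(x)
    confidence = max(probs)
    if confidence >= threshold:
        return probs.index(confidence), False
    probs = expensive_model(x)          # fall back only for hard inputs
    return probs.index(max(probs)), True

# Toy stand-ins: each "model" maps an input to class probabilities.
cheap = lambda x: [0.95, 0.05] if x < 0 else [0.55, 0.45]
expensive = lambda x: [0.1, 0.9]

print(cascade_predict(-1.0, cheap, expensive))  # easy input: (0, False)
print(cascade_predict(1.0, cheap, expensive))   # hard input: (1, True)
```

Average cost drops because only the hard fraction of inputs pays for the expensive model; the threshold trades accuracy against compute.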

Improved techniques for training adaptive deep networks

H Li, H Zhang, X Qi, R Yang… - Proceedings of the IEEE …, 2019 - openaccess.thecvf.com
Adaptive inference is a promising technique to improve the computational efficiency of deep
models at test time. In contrast to static models which use the same computation graph for all …

Any-precision deep neural networks

H Yu, H Li, H Shi, TS Huang, G Hua - Proceedings of the AAAI …, 2021 - ojs.aaai.org
We present any-precision deep neural networks (DNNs), which are trained with a new
method that allows the learned DNNs to be flexible in numerical precision during inference …
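A minimal sketch of the flexible-precision idea: the same learned weights served at a bit-width chosen at inference time. Symmetric uniform quantization is used here as a generic scheme, not necessarily the paper's exact method, and the function names are illustrative:

```python
def quantize(weights, bits):
    """Symmetric per-tensor uniform quantization to the given bit-width.

    Assumes a non-empty list with at least one non-zero weight.
    """
    levels = 2 ** (bits - 1) - 1                  # e.g. 127 for 8 bits
    scale = max(abs(w) for w in weights) / levels
    return [round(w / scale) * scale for w in weights]

w = [0.5, -0.25, 0.1]
w8 = quantize(w, 8)   # near-lossless at 8 bits
w2 = quantize(w, 2)   # coarse at 2 bits: only levels {-0.5, 0, 0.5} survive
```

At runtime a deployment could call `quantize` with whatever bit-width the current energy budget allows, using one stored model rather than one model per precision.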

Dual dynamic inference: Enabling more efficient, adaptive, and controllable deep inference

Y Wang, J Shen, TK Hu, P Xu, T Nguyen… - IEEE Journal of …, 2020 - ieeexplore.ieee.org
State-of-the-art convolutional neural networks (CNNs) yield record-breaking predictive
performance, yet at the cost of high-energy-consumption inference, which prohibits their wide …

Dynamic-OFA: Runtime DNN architecture switching for performance scaling on heterogeneous embedded platforms

W Lou, L Xun, A Sabet, J Bi, J Hare… - Proceedings of the …, 2021 - openaccess.thecvf.com
Mobile and embedded platforms are increasingly required to efficiently execute
computationally demanding DNNs across heterogeneous processing elements. At runtime …
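The runtime-switching idea can be illustrated with a hypothetical scheduler that picks, from a family of pre-trained sub-networks, the most accurate one that fits the current latency budget. The dictionary fields and the selection rule are assumptions for illustration, not Dynamic-OFA's actual policy:

```python
def pick_subnet(subnets, latency_budget_ms):
    """Choose the most accurate sub-network within the latency budget;
    if none fits, fall back to the fastest one available."""
    feasible = [s for s in subnets if s["latency_ms"] <= latency_budget_ms]
    if not feasible:
        return min(subnets, key=lambda s: s["latency_ms"])
    return max(feasible, key=lambda s: s["accuracy"])

# Hypothetical profiled sub-networks of one supernet.
subnets = [
    {"name": "small",  "latency_ms": 10, "accuracy": 0.70},
    {"name": "medium", "latency_ms": 25, "accuracy": 0.76},
    {"name": "large",  "latency_ms": 60, "accuracy": 0.80},
]

print(pick_subnet(subnets, 30)["name"])  # "medium"
print(pick_subnet(subnets, 5)["name"])   # budget too tight: "small"
```

As load or thermal conditions change at runtime, the budget changes and a different sub-network is selected without reloading a separate model.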

STONNE: Enabling cycle-level microarchitectural simulation for DNN inference accelerators

F Muñoz-Martínez, JL Abellán… - 2021 IEEE …, 2021 - ieeexplore.ieee.org
The design of specialized architectures for accelerating the inference procedure of Deep
Neural Networks (DNNs) is a booming area of research nowadays. While first-generation …

Discovering low-precision networks close to full-precision networks for efficient embedded inference

JL McKinstry, SK Esser, R Appuswamy… - arXiv preprint arXiv …, 2018 - arxiv.org
To realize the promise of ubiquitous embedded deep network inference, it is essential to
seek limits of energy and area efficiency. To this end, low-precision networks offer …

Learning to weight samples for dynamic early-exiting networks

Y Han, Y Pu, Z Lai, C Wang, S Song, J Cao… - European conference on …, 2022 - Springer
Early exiting is an effective paradigm for improving the inference efficiency of deep networks.
By constructing classifiers with varying resource demands (the exits), such networks allow …
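The early-exit mechanism this snippet describes — intermediate classifiers attached at increasing depths, with computation stopping at the first sufficiently confident one — can be sketched generically (all names and toy stages are invented):

```python
def early_exit_predict(x, stages, exits, threshold=0.8):
    """Run backbone stages in order; after each, query its exit head
    and stop as soon as the top confidence clears the threshold.

    Returns (predicted_label, index_of_exit_taken).
    """
    h = x
    for i, (stage, exit_head) in enumerate(zip(stages, exits)):
        h = stage(h)
        probs = exit_head(h)
        if max(probs) >= threshold:
            return probs.index(max(probs)), i
    # No exit was confident: keep the final head's prediction.
    return probs.index(max(probs)), len(stages) - 1

# Toy two-stage backbone with an exit head after each stage.
stages = [lambda h: h + 1, lambda h: h * 2]
exits = [
    lambda h: [0.9, 0.1] if h > 1 else [0.5, 0.5],
    lambda h: [0.2, 0.8],
]

print(early_exit_predict(2, stages, exits))  # confident at exit 0: (0, 0)
print(early_exit_predict(0, stages, exits))  # needs full depth: (1, 1)
```

Easy inputs leave early and pay only for the shallow prefix; the per-sample weighting studied in this paper concerns how such exits are trained, which the sketch does not cover.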

Zero time waste: Recycling predictions in early exit neural networks

M Wołczyk, B Wójcik, K Bałazy… - Advances in …, 2021 - proceedings.neurips.cc
The problem of reducing processing time of large deep learning models is a fundamental
challenge in many real-world applications. Early exit methods strive towards this goal by …
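The recycling idea this snippet gestures at — an exit's prediction is not thrown away when it fails the confidence test, but reused by later exits — can be sketched with a running ensemble over all exits seen so far. This simple averaging rule is an assumption for illustration, not the paper's exact combination scheme:

```python
def recycling_predict(x, stages, exits, threshold=0.8):
    """Early exiting where each decision point ensembles the current
    exit's probabilities with those of every earlier exit."""
    h, history = x, []
    for i, (stage, exit_head) in enumerate(zip(stages, exits)):
        h = stage(h)
        history.append(exit_head(h))
        # Running average over all exits computed so far ("recycling").
        combined = [sum(col) / len(history) for col in zip(*history)]
        if max(combined) >= threshold:
            return combined.index(max(combined)), i
    return combined.index(max(combined)), len(stages) - 1

# Toy backbone: the first exit alone is not confident enough,
# but combined with the second it is.
stages = [lambda h: h, lambda h: h]
exits = [lambda h: [0.6, 0.4], lambda h: [1.0, 0.0]]

print(recycling_predict(0, stages, exits))  # averages to [0.8, 0.2]: (0, 1)
```

Compared with plain early exiting, no already-paid-for computation is wasted: every earlier head still contributes to the final decision.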