Scheduling real-time deep learning services as imprecise computations

S Yao, Y Hao, Y Zhao, H Shao, D Liu… - 2020 IEEE 26th …, 2020 - ieeexplore.ieee.org
The paper presents a real-time computing framework for intelligent real-time edge services,
on behalf of local embedded devices that are themselves unable to support extensive …

DeepRT: A soft real time scheduler for computer vision applications on the edge

Z Yang, K Nahrstedt, H Guo… - 2021 IEEE/ACM …, 2021 - ieeexplore.ieee.org
The ubiquity of smartphone cameras and IoT cameras, together with the recent boom of
deep learning and deep neural networks, proliferate various computer vision driven mobile …

LaLaRAND: Flexible layer-by-layer CPU/GPU scheduling for real-time DNN tasks

W Kang, K Lee, J Lee, I Shin… - 2021 IEEE Real-Time …, 2021 - ieeexplore.ieee.org
Deep neural networks (DNNs) have shown remarkable success in various machine-learning
(ML) tasks useful for many safety-critical, real-time embedded systems. The foremost design …

Pipelined data-parallel CPU/GPU scheduling for multi-DNN real-time inference

Y Xiang, H Kim - 2019 IEEE Real-Time Systems Symposium …, 2019 - ieeexplore.ieee.org
Deep neural networks (DNNs) have been showing significant success in various
applications, such as autonomous driving, mobile devices, and Internet of Things. Although …

ApNet: Approximation-aware real-time neural network

S Bateni, C Liu - 2018 IEEE Real-Time Systems Symposium …, 2018 - ieeexplore.ieee.org
Modern embedded cyber-physical systems are becoming entangled with the realm of deep
neural networks (DNNs) towards increased autonomy. While applying DNNs can …

On removing algorithmic priority inversion from mission-critical machine inference pipelines

S Liu, S Yao, X Fu, R Tabish, S Yu… - 2020 IEEE Real …, 2020 - ieeexplore.ieee.org
The paper discusses algorithmic priority inversion in mission-critical machine inference
pipelines used in modern neural-network-based cyber-physical applications, and develops …

SEE: Scheduling early exit for mobile DNN inference during service outage

Z Wang, W Bao, D Yuan, L Ge, NH Tran… - Proceedings of the 22nd …, 2019 - dl.acm.org
In recent years, the rapid development of edge computing enables us to process a wide
variety of intelligent applications at the edge, such as real-time video analytics. However …

QoS-aware scheduling of heterogeneous servers for inference in deep neural networks

Z Fang, T Yu, OJ Mengshoel, RK Gupta - Proceedings of the 2017 ACM …, 2017 - dl.acm.org
Deep neural networks (DNNs) are popular in diverse fields such as computer vision and
natural language processing. DNN inference tasks are emerging as a service provided by …

Zygarde: Time-sensitive on-device deep inference and adaptation on intermittently-powered systems

B Islam, S Nirjon - arXiv preprint arXiv:1905.03854, 2019 - arxiv.org
We propose Zygarde, which is an energy- and accuracy-aware soft real-time task
scheduling framework for batteryless systems that flexibly execute deep learning tasks that …

PREMA: A predictive multi-task scheduling algorithm for preemptible neural processing units

Y Choi, M Rhu - 2020 IEEE International Symposium on High …, 2020 - ieeexplore.ieee.org
To amortize cost, cloud vendors providing DNN acceleration as a service to end-users
employ consolidation and virtualization to share the underlying resources among multiple …