Astraea: A fair deep learning scheduler for multi-tenant GPU clusters

Z Ye, P Sun, W Gao, T Zhang, X Wang… - … on Parallel and …, 2021 - ieeexplore.ieee.org
Modern GPU clusters are designed to support distributed Deep Learning jobs from multiple
tenants concurrently. Each tenant may have varied and dynamic resource demands …

RTGPU: Real-time GPU scheduling of hard deadline parallel tasks with fine-grain utilization

A Zou, J Li, CD Gill, X Zhang - IEEE Transactions on Parallel …, 2023 - ieeexplore.ieee.org
Many emerging cyber-physical systems, such as autonomous vehicles and robots, rely
heavily on artificial intelligence and machine learning algorithms to perform important …

Fractional GPUs: Software-based compute and memory bandwidth reservation for GPUs

S Jain, I Baek, S Wang… - 2019 IEEE Real-Time and …, 2019 - ieeexplore.ieee.org
GPUs are increasingly being used in real-time systems, such as autonomous vehicles, due
to the vast performance benefits that they offer. As more and more applications use GPUs …

Fine-grain task aggregation and coordination on GPUs

MS Orr, BM Beckmann, SK Reinhardt… - ACM SIGARCH …, 2014 - dl.acm.org
In general-purpose graphics processing unit (GPGPU) computing, data is processed by
concurrent threads executing the same function. This model, dubbed single …

AEML: An acceleration engine for multi-GPU load-balancing in distributed heterogeneous environment

Z Tang, L Du, X Zhang, L Yang… - IEEE Transactions on …, 2021 - ieeexplore.ieee.org
To meet the rapidly growing computation requirements in the big data and artificial intelligence
areas, CPU-GPU heterogeneous clusters can provide more powerful computing capacity …

Improving GPU multi-tenancy with page walk stealing

B Pratheek, N Jawalkar, A Basu - 2021 IEEE International …, 2021 - ieeexplore.ieee.org
GPU (Graphics Processing Unit) architecture has evolved to accelerate parts of a single
application at a time. Consequently, several aspects of its architecture, particularly the virtual …

Zico: Efficient GPU memory sharing for concurrent DNN training

G Lim, J Ahn, W Xiao, Y Kwon, M Jeon - 2021 USENIX Annual Technical …, 2021 - usenix.org
GPUs are the workhorse in modern server infrastructure fueling advances in a number of
compute-intensive workloads such as deep neural network (DNN) training. Several recent …

Interference-aware parallelization for deep learning workload in GPU cluster

X Geng, H Zhang, Z Zhao, H Ma - Cluster Computing, 2020 - Springer
With the widespread use of GPUs for performing deep learning applications, the issue of
efficient execution of multiple deep learning jobs in a GPU cluster has attracted great …

Characterization and prediction of performance interference on mediated passthrough GPUs for interference-aware scheduler

X Xu, N Zhang, M Cui, M He, R Surana - 11th USENIX Workshop on Hot …, 2019 - usenix.org
Sharing GPUs in the cloud is cost effective and can facilitate the adoption of hardware
accelerator enabled clouds. But sharing causes interference between co-located VMs …

Deep learning research and development platform: Characterizing and scheduling with QoS guarantees on GPU clusters

Z Chen, W Quan, M Wen, J Fang, J Yu… - … on Parallel and …, 2019 - ieeexplore.ieee.org
Deep learning (DL) has been widely adopted in various domains of artificial intelligence (AI),
achieving dramatic developments in industry and academia. Besides giant AI companies …