Deadline-aware offloading for high-throughput accelerators

TT Yeh, MD Sinclair, BM Beckmann… - … Symposium on High …, 2021 - ieeexplore.ieee.org
Contemporary GPUs are widely used for throughput-oriented data-parallel workloads and
are increasingly being considered for latency-sensitive applications in datacenters …

Deep learning workload scheduling in GPU datacenters: A survey

Z Ye, W Gao, Q Hu, P Sun, X Wang, Y Luo… - ACM Computing …, 2024 - dl.acm.org
Deep learning (DL) has demonstrated its remarkable success in a wide variety of fields. The
development of a DL model is a time-consuming and resource-intensive procedure. Hence …

D-STACK: High Throughput DNN Inference by Effective Multiplexing and Spatio-Temporal Scheduling of GPUs

A Dhakal, SG Kulkarni, KK Ramakrishnan - arXiv preprint arXiv …, 2023 - arxiv.org
Hardware accelerators such as GPUs are required for real-time, low-latency inference with
Deep Neural Networks (DNN). However, due to the inherent limits to the parallelism they …

ArkGPU: enabling applications' high-goodput co-location execution on multitasking GPUs

J Lou, Y Sun, J Zhang, H Cao, Y Zhang… - CCF Transactions on High …, 2023 - Springer
With the development of deep learning, hardware accelerators, typified by GPUs, have
been used to accelerate the execution of deep learning applications. A key problem in GPU …

Characterizing concurrency mechanisms for NVIDIA GPUs under deep learning workloads

G Gilman, RJ Walls - ACM SIGMETRICS Performance Evaluation …, 2022 - dl.acm.org
Hazelwood et al. observed that at Facebook data centers, variations in user activity (e.g., due
to diurnal load) resulted in low utilization periods with large pools of idle resources [4]. To …

DeepBoot: Dynamic Scheduling System for Training and Inference Deep Learning Tasks in GPU Cluster

Z Chen, X Zhao, C Zhi, J Yin - IEEE Transactions on Parallel …, 2023 - ieeexplore.ieee.org
Deep learning tasks (DLT) include training and inference tasks, where training DLTs aim to
minimize average job completion time (JCT) and inference tasks need …

Ebird: Elastic batch for improving responsiveness and throughput of deep learning services

W Cui, M Wei, Q Chen, X Tang, J Leng… - 2019 IEEE 37th …, 2019 - ieeexplore.ieee.org
GPUs have been widely adopted to serve online deep learning-based services that have
stringent QoS requirements. However, emerging deep learning serving systems often result …

EDGE: Event-driven GPU execution

TH Hetherington, M Lubeznov, D Shah… - 2019 28th …, 2019 - ieeexplore.ieee.org
GPUs are known to benefit structured applications with ample parallelism, such as deep
learning in a datacenter. Recently, GPUs have shown promise for irregular streaming …

CARSS: Client-aware resource sharing and scheduling for heterogeneous applications

I Baek, M Harding, A Kanda, KR Choi… - 2020 IEEE Real …, 2020 - ieeexplore.ieee.org
Modern hardware accelerators such as GP-GPUs and DSPs are commonly used in
real-time settings such as high-performance multimedia systems and autonomous vehicles …

SchedTune: A heterogeneity-aware GPU scheduler for deep learning

H Albahar, S Dongare, Y Du, N Zhao… - 2022 22nd IEEE …, 2022 - ieeexplore.ieee.org
Modern cluster management systems, such as Kubernetes, support heterogeneous
workloads and resources. However, existing resource schedulers in these systems do not …