Analyzing machine learning workloads using a detailed GPU simulator

J Lew, DA Shah, S Pati, S Cattell… - … analysis of systems …, 2019 - ieeexplore.ieee.org
Machine learning (ML) has recently emerged as an important application driving future
architecture design. Traditionally, architecture research has used detailed simulators to …

LAS: Locality-aware scheduling for GEMM-accelerated convolutions in GPUs

H Kim, WJ Song - IEEE Transactions on Parallel and …, 2023 - ieeexplore.ieee.org
This article presents a graphics processing unit (GPU) scheduling scheme that maximizes
the exploitation of data locality in deep neural networks (DNNs). Convolution is one of the …

Sia: Heterogeneity-aware, goodput-optimized ML-cluster scheduling

S Jayaram Subramanya, D Arfeen, S Lin… - Proceedings of the 29th …, 2023 - dl.acm.org
The Sia scheduler efficiently assigns heterogeneous deep learning (DL) cluster resources to
elastic resource-adaptive jobs. Although some recent schedulers address one aspect or …

Elastic deep learning in multi-tenant GPU clusters

Y Wu, K Ma, X Yan, Z Liu, Z Cai… - … on Parallel and …, 2021 - ieeexplore.ieee.org
We study how to support elasticity, that is, the ability to dynamically adjust the parallelism (i.e.,
the number of GPUs), for deep neural network (DNN) training in a GPU cluster. Elasticity can …

Characterizing concurrency mechanisms for NVIDIA GPUs under deep learning workloads

G Gilman, RJ Walls - ACM SIGMETRICS Performance Evaluation …, 2022 - dl.acm.org
Hazelwood et al. observed that at Facebook data centers, variations in user activity (e.g., due
to diurnal load) resulted in low utilization periods with large pools of idle resources [4]. To …

Multi-model machine learning inference serving with GPU spatial partitioning

S Choi, S Lee, Y Kim, J Park, Y Kwon, J Huh - arXiv preprint arXiv …, 2021 - arxiv.org
As machine learning techniques are applied to a widening range of applications, high
throughput machine learning (ML) inference servers have become critical for online service …

Topology-aware GPU scheduling for learning workloads in cloud environments

M Amaral, J Polo, D Carrera, S Seelam… - Proceedings of the …, 2017 - dl.acm.org
Recent advances in hardware, such as systems with multiple GPUs and their availability in
the cloud, are enabling deep learning in various domains including health care …

Aryl: An elastic cluster scheduler for deep learning

J Li, H Xu, Y Zhu, Z Liu, C Guo, C Wang - arXiv preprint arXiv:2202.07896, 2022 - arxiv.org
Companies build separate training and inference GPU clusters for deep learning, and use
separate schedulers to manage them. This leads to problems for both training and inference …

EasyScale: Elastic Training with Consistent Accuracy and Improved Utilization on GPUs

M Li, W Xiao, H Yang, B Sun, H Zhao, S Ren… - Proceedings of the …, 2023 - dl.acm.org
Distributed synchronized GPU training is commonly used for deep learning. The resource
constraint of using a fixed number of GPUs makes large-scale training jobs suffer from long …

Tiresias: A GPU cluster manager for distributed deep learning

J Gu, M Chowdhury, KG Shin, Y Zhu, M Jeon… - … USENIX Symposium on …, 2019 - usenix.org
Deep learning (DL) training jobs bring some unique challenges to existing cluster
managers, such as unpredictable training times, an all-or-nothing execution model, and …