Related articles - Academic resource search

Orion: Interference-aware, Fine-grained GPU Sharing for ML Applications

F Strati, X Ma, A Klimovic - … of the Nineteenth European Conference on …, 2024 - dl.acm.org

GPUs are critical for maximizing the throughput-per-Watt of deep neural network (DNN)
applications. However, DNN applications often underutilize GPUs, even when using large …

Cited by 7 | Related articles | All 3 versions

[PDF] uma.es

Strategies for maximizing utilization on multi-CPU and multi-GPU heterogeneous architectures

A Navarro, A Vilches, F Corbera, R Asenjo - The Journal of …, 2014 - Springer

This paper explores the possibility of efficiently executing a single application using
multicores simultaneously with multiple GPU accelerators under a parallel task …

Cited by 37 | Related articles | All 7 versions

[PDF] ieee.org

TurboMGNN: Improving concurrent GNN training tasks on GPU with fine-grained kernel fusion

W Wu, X Shi, L He, H Jin - IEEE Transactions on Parallel and …, 2023 - ieeexplore.ieee.org

Graph Neural Networks (GNN) have evolved as powerful models for graph representation
learning. Many works have been proposed to support GNN training efficiently on GPU …

Cited by 6 | Related articles | All 4 versions

[PDF] uark.edu

Scaling scientific applications on clusters of hybrid multicore/GPU nodes

L Wang, M Huang, VK Narayana… - Proceedings of the 8th …, 2011 - dl.acm.org

Rapid advances in the performance and programmability of graphics accelerators have
made GPU computing a compelling solution for a wide variety of application domains …

Cited by 34 | Related articles | All 7 versions

[PDF] ieee.org

Characterizing fine-grained resource utilization for multitasking GPGPU in cloud systems

K Cho, H Bahn - IEEE Access, 2021 - ieeexplore.ieee.org

Managing GPGPU resources in cloud systems is challenging as workloads with various
resource usage patterns coexist. To determine the co-location of workloads, previous …

Cited by 2 | Related articles | All 3 versions

[PDF] github.io

Raise: Efficient GPU resource management via hybrid scheduling

Y Weng, T Ge, X Zhang, X Zhang… - 2022 22nd IEEE …, 2022 - ieeexplore.ieee.org

As the de facto high-throughput accelerators, graphics processing units (GPUs) are now
used in a wide spectrum of fields, including artificial intelligence, high performance …

Cited by 3 | Related articles | All 4 versions

[PDF] ncsu.edu

[PDF] A programming model for massive data parallelism with data dependencies

Y Zhang, F Mueller, X Cui, T Potok - Workshop on Programming …, 2009 - arcb.csc.ncsu.edu

Accelerating processors can often be more cost and energy effective for a wide range of
data-parallel computing problems than general-purpose processors. For graphics processor …

Cited by 4 | Related articles | All 5 versions

Analyzing fine-grained resource utilization for efficient GPU workload allocation

Y Park, D Shin, K Cho, H Bahn - The Journal of The Institute of …, 2019 - koreascience.kr

Recently, GPU expands application domains from graphic processing to various kinds of
parallel workloads. However, current GPU systems focus on the maximization of each …

Cited by 5 | Related articles

[PDF] github.io

Tacker: Tensor-CUDA core kernel fusion for improving the GPU utilization while ensuring QoS

H Zhao, W Cui, Q Chen, Y Zhang, Y Lu… - … Symposium on High …, 2022 - ieeexplore.ieee.org

The proliferation of machine learning applications has promoted both CUDA Cores and
Tensor Cores' integration to meet their acceleration demands. While studies have shown …

Cited by 16 | Related articles | All 3 versions

[PDF] tsinghua.edu.cn

Gemma in April: A matrix-like parallel programming architecture on OpenCL

T Wu, D Wu, Y Wang, X Zhang, H Luo… - … , Automation & Test …, 2011 - ieeexplore.ieee.org

Nowadays, Graphics Processing Unit (GPU), as a kind of massive parallel processor, has
been widely used in general purposed computing tasks. Although there have been mature …

Cited by 3 | Related articles | All 9 versions
