Authors

Devashree Tripathy, Amirali Abdolrashidi, Laxmi Narayan Bhuyan, Liang Zhou, Daniel Wong

Publication date

2021/6/8

Journal

ACM Transactions on Architecture and Code Optimization (TACO)

Pages

1-26

Publisher

ACM

Description

The massive parallelism present in GPUs comes at the cost of reduced L1 and L2 cache sizes per thread, leading to serious cache contention problems such as thrashing. Hence, the data access locality of an application should be considered during thread scheduling to improve execution time and energy consumption. Recent works have tried to use the locality behavior of regular and structured applications in thread scheduling, but the difficult case of irregular and unstructured parallel applications remains to be explored.

We present PAVER, a Priority-Aware Vertex schedulER, which takes a graph-theoretic approach toward thread scheduling. We analyze the cache locality behavior among thread blocks (TBs) through a just-in-time compilation, and represent the problem using a graph representing the TBs and the locality among them. This graph is then partitioned to TB groups that display maximum data …

Total citations

Cited by 28

[Chart: citations per year, 2021 to 2024]

Scholar articles

PAVER: Locality Graph-Based Thread Block Scheduling for GPUs

D Tripathy, A Abdolrashidi, LN Bhuyan, L Zhou… - ACM Transactions on Architecture and Code …, 2021
