D Shen, SL Song, A Li, X Liu - … of the 2018 International Symposium on …, 2018 - dl.acm.org
General-purpose GPUs have been widely utilized to accelerate parallel applications. Given a relatively complex programming model and fast architecture evolution, producing efficient …
S Dublish, V Nagarajan, N Topham - ACM Transactions on Architecture …, 2016 - dl.acm.org
The rise of general-purpose computing on GPUs has influenced architectural innovation on them. The introduction of an on-chip cache hierarchy is one such innovation. High L1 miss …
Graph representations of data are ubiquitous in analytic applications. However, graph workloads are notorious for having irregular memory access patterns with variable access …
Graphics processing units (GPUs) include a large amount of hardware resources for parallel thread executions. However, the resources are not fully utilized during runtime, and …
A Manocha, JL Aragón… - IEEE Transactions on …, 2022 - ieeexplore.ieee.org
Despite their ubiquity in many important big-data applications, graph analytic kernels continue to challenge modern memory hierarchies due to their frequent, long-latency …
Modern GPUs suffer from cache contention due to the limited cache size that is shared across tens of concurrently running warps. To increase the per-warp cache size prior …
C Zhang, Y Zeng, X Guo - IEEE Transactions on Computers, 2019 - ieeexplore.ieee.org
A large fraction of the microprocessor energy is consumed by the data movement in the system. One of the reasons is the inefficiency in the conventional cache design. Cache …
Modern processors have a large processor-memory frequency gap, which urges the computer designer to address the issue of the inefficiency of the memory system …
D Wang, W Yu, RJ Stones, J Ren… - 2017 IEEE 23rd …, 2017 - ieeexplore.ieee.org
There are two inherent obstacles to effectively using Graphics Processing Units (GPUs) for query processing in search engines: (a) the highly restricted GPU memory space, and (b) the …