This work observes that a large fraction of the computations performed by Deep Neural Networks (DNNs) are intrinsically ineffectual as they involve a multiplication where one of …
General-purpose GPUs (GPGPUs) are becoming prevalent in mainstream computing, and performance per watt has emerged as a more crucial evaluation metric than peak …
This paper studies the effects of hardware thread scheduling on cache management in GPUs. We propose Cache-Conscious Wave front Scheduling (CCWS), an adaptive …
GPUs have been used to accelerate many regular applications and, more recently, irregular applications in which the control flow and memory access patterns are data-dependent and …
V Narasiman, M Shebanow, CJ Lee… - Proceedings of the 44th …, 2011 - dl.acm.org
Due to their massive computational power, graphics processing units (GPUs) have become a popular platform for executing general purpose parallel applications. GPU programming …
Emerging GPGPU architectures, along with programming models like CUDA and OpenCL, offer a cost-effective platform for many applications by providing high thread level …
General-purpose graphics processing units (GPG-PUs) are at their best in accelerating computation by exploiting abundant thread-level parallelism (TLP) offered by many classes …
The Single Instruction Multiple Data (SIMD) architecture, supported by various high- performance computing platforms, efficiently utilizes data-level parallelism. The SIMD model …
B Pichai, L Hsu, A Bhattacharjee - ACM SIGARCH Computer Architecture …, 2014 - dl.acm.org
The proliferation of heterogeneous compute platforms, of which CPU/GPU is a prevalent example, necessitates a manageable programming model to ensure widespread adoption …