A survey on techniques for cooperative CPU-GPU computing

K Raju, NN Chiplunkar - Sustainable Computing: Informatics and Systems, 2018 - Elsevier
Graphical Processing Unit provides massive parallelism due to the presence of
hundreds of cores. Usage of GPUs for general purpose computation (GPGPU) has resulted …

High-performance packet classification on GPU

S Zhou, SG Singapura… - 2014 IEEE High …, 2014 - ieeexplore.ieee.org
Multi-field packet classification is a network kernel function where packets are classified and
routed based on a predefined rule set. Recently, there has been a new trend in exploring …

A survey on parallel computing for traditional computer vision

D Jaiswal, P Kumar - Concurrency and Computation: Practice …, 2022 - Wiley Online Library
The applications of computer vision (CV) are continuously increasing along with the
enormous demand for real‐time data processing. This visual data processing is done with …

COX: Exposing CUDA warp-level functions to CPUs

R Han, J Lee, J Sim, H Kim - ACM Transactions on Architecture and …, 2022 - dl.acm.org
As CUDA becomes the de facto programming language among data parallel applications
such as high-performance computing or machine learning applications, running CUDA on …

CuPBoP: Making CUDA a portable language

R Han, J Chen, B Garg, X Zhou, J Lu, J Young… - ACM Transactions on …, 2024 - dl.acm.org
CUDA is designed specifically for NVIDIA GPUs and is not compatible with non-NVIDIA
devices. Enabling CUDA execution on alternative backends could greatly benefit the …

An effective clustering algorithm to index high dimensional metric spaces

E Chávez, G Navarro - Proceedings Seventh International …, 2000 - ieeexplore.ieee.org
A metric space consists of a collection of objects and a distance function defined among
them, which satisfies the triangular inequality. The goal is to preprocess the set so that, given …

A unified CPU-GPU protocol for GNN training

YC Lin, G Deng, V Prasanna - Proceedings of the 21st ACM International …, 2024 - dl.acm.org
Training a Graph Neural Network (GNN) model on large-scale graphs involves a high
volume of data communication and computations. While state-of-the-art CPUs and GPUs …

Performance improvement of CUDA applications by reducing CPU-GPU data transfer overhead

NV Sunitha, K Raju… - … international conference on …, 2017 - ieeexplore.ieee.org
In a CPU-GPU based heterogeneous computing system, the input data to be processed by
the kernel resides in the host memory. The host and the device memory address spaces are …

Parallel CPU–GPU computing technique for discrete element method

V Skorych, M Dosta - Concurrency and Computation: Practice …, 2022 - Wiley Online Library
The efficiency of the simulations with the discrete element method (DEM) is significantly
improved using a novel computational strategy. The new method is developed with a focus …

A high performance implementation of spectral clustering on CPU-GPU platforms

Y Jin, JF JaJa - 2016 IEEE International Parallel and Distributed …, 2016 - ieeexplore.ieee.org
Spectral clustering is one of the most popular graph clustering algorithms, which achieves
the best performance for many scientific and engineering applications. However, existing …