Solutions for optimizing the data parallel prefix sum algorithm using the Compute Unified Device Architecture

I Lungu, A Pîrjan, DM Petroşanu - Journal of Information Systems & Operations Management, 2011 - core.ac.uk
Abstract
In this paper, we analyze solutions for optimizing the data-parallel prefix sum function using the Compute Unified Device Architecture (CUDA), which provides a viable way to accelerate a broad class of applications. The parallel prefix sum function is an essential building block for many data mining algorithms, and therefore its optimization benefits the whole data mining process. Finally, we benchmark and evaluate the performance of the optimized parallel prefix sum building block in CUDA.
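For readers unfamiliar with the operation, the following is a minimal sketch of what a prefix sum computes. It is not the authors' CUDA implementation, which the abstract does not show. The function names are hypothetical, and the second function uses the well-known work-efficient up-sweep/down-sweep scan as an assumed stand-in for a data-parallel formulation, executed sequentially on the CPU.

```python
from typing import List


def exclusive_scan_reference(values: List[int]) -> List[int]:
    """Sequential exclusive prefix sum: out[i] = sum(values[:i])."""
    out = []
    running = 0
    for v in values:
        out.append(running)
        running += v
    return out


def exclusive_scan_blelloch(values: List[int]) -> List[int]:
    """Work-efficient (up-sweep / down-sweep) exclusive scan.

    Iterations of each inner loop are independent of one another at a
    given stride, so on a GPU each could be handled by its own thread.
    Here they simply run one after another on the CPU.
    """
    n = len(values)
    if n == 0:
        return []

    # Assumption of this sketch: pad with zeros to the next power of two.
    size = 1
    while size < n:
        size *= 2
    data = list(values) + [0] * (size - n)

    # Up-sweep (reduce): build partial sums in place.
    stride = 1
    while stride < size:
        for i in range(2 * stride - 1, size, 2 * stride):
            data[i] += data[i - stride]
        stride *= 2

    # Down-sweep: clear the root, then push prefixes back down the tree.
    data[size - 1] = 0
    stride = size // 2
    while stride >= 1:
        for i in range(2 * stride - 1, size, 2 * stride):
            left = data[i - stride]
            data[i - stride] = data[i]
            data[i] += left
        stride //= 2

    return data[:n]
```

Both functions should agree on any input; for example, `exclusive_scan_blelloch([3, 1, 7, 0, 4, 1, 6, 3])` yields `[0, 3, 4, 11, 11, 15, 16, 22]`. An actual CUDA version would also have to address thread synchronization and memory access patterns, which this sketch does not model.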