Polyhedral extraction tool

Tensor comprehensions: Framework-agnostic high-performance machine learning abstractions

N Vasilache, O Zinenko, T Theodoridis, P Goyal… - arXiv preprint arXiv …, 2018 - arxiv.org

Deep learning models with convolutional and recurrent networks are now ubiquitous and
analyze massive amounts of audio, image, video, text and graph data, with applications in …

Cited by 523 · Related articles · All 6 versions

Polly—performing polyhedral optimizations on a low-level intermediate representation

T Grosser, A Groesslinger, C Lengauer - Parallel Processing Letters, 2012 - World Scientific

The polyhedral model for loop parallelization has proved to be an effective tool for advanced
optimization and automatic parallelization of programs in higher-level languages. Yet, to …

Cited by 758 · Related articles · All 20 versions

Memory-centric accelerator design for convolutional neural networks

M Peemen, AAA Setio, B Mesman… - 2013 IEEE 31st …, 2013 - ieeexplore.ieee.org

In the near future, cameras will be used everywhere as flexible sensors for numerous
applications. For mobility and privacy reasons, the required image processing should be …

Cited by 488 · Related articles · All 10 versions

Polyhedral parallel code generation for CUDA

S Verdoolaege, J Carlos Juega, A Cohen… - ACM Transactions on …, 2013 - dl.acm.org

This article addresses the compilation of a sequential program for parallel execution on a
modern GPU. To this end, we present a novel source-to-source compiler called PPCG …

Cited by 522 · Related articles · All 10 versions

Pencil: A platform-neutral compute intermediate language for accelerator programming

R Baghdadi, U Beaugnon, A Cohen… - 2015 International …, 2015 - ieeexplore.ieee.org

Programming accelerators such as GPUs with low-level APIs and languages such as
OpenCL and CUDA is difficult, error-prone, and not performance-portable. Automatic …

Cited by 167 · Related articles · All 23 versions

Hybrid hexagonal/classical tiling for GPUs

T Grosser, A Cohen, J Holewinski… - Proceedings of Annual …, 2014 - dl.acm.org

Time-tiling is necessary for the efficient execution of iterative stencil computations. Classical
hyper-rectangular tiles cannot be used due to the combination of backward and forward …

Cited by 158 · Related articles · All 13 versions

AN5D: automated stencil framework for high-degree temporal blocking on GPUs

K Matsumura, HR Zohouri, M Wahib, T Endo… - Proceedings of the 18th …, 2020 - dl.acm.org

Stencil computation is one of the most widely-used compute patterns in high performance
computing applications. Spatial and temporal blocking have been proposed to overcome the …

Cited by 65 · Related articles · All 5 versions

Polyhedral AST generation is more than scanning polyhedra

T Grosser, S Verdoolaege, A Cohen - ACM Transactions on …, 2015 - dl.acm.org

Abstract mathematical representations such as integer polyhedra have been shown to be
useful to precisely analyze computational kernels and to express complex loop …

Cited by 101 · Related articles · All 7 versions

Split tiling for GPUs: automatic parallelization using trapezoidal tiles

T Grosser, A Cohen, PHJ Kelly, J Ramanujam… - Proceedings of the 6th …, 2013 - dl.acm.org

Tiling is a key technique to enhance data reuse. For computations structured as one
sequential outer "time" loop enclosing a set of parallel inner loops, tiling only the parallel …

Cited by 116 · Related articles · All 8 versions

Diamond tiling: Tiling techniques to maximize parallelism for stencil computations

U Bondhugula, V Bandishti… - IEEE Transactions on …, 2016 - ieeexplore.ieee.org

Most stencil computations allow tile-wise concurrent start, i.e., there always exists a face of
the iteration space and a set of tiling directions such that all tiles along that face can be …

Cited by 88 · Related articles · All 6 versions

Tensor comprehensions: Framework-agnostic high-performance machine learning abstractions
