A quantitative roofline model for GPU kernel performance estimation using micro-benchmarks and hardware metric profiling

E Konstantinidis, Y Cotronis - Journal of Parallel and Distributed Computing, 2017 - Elsevier
Typically, the execution time of a kernel on a GPU is a difficult-to-predict measure as it
depends on a wide range of factors. Performance can be limited by either memory transfer …

Multiobjective GPU design space exploration optimization

A Jooya, N Dimopoulos, A Baniasadi - Microprocessors and Microsystems, 2019 - Elsevier
It has been more than a decade since general purpose applications targeted GPUs to benefit
from the enormous processing power they offer. However, not all applications gain speedup …

Scaling Monte Carlo tree search on Intel Xeon Phi

SA Mirsoleimani, A Plaat… - 2015 IEEE 21st …, 2015 - ieeexplore.ieee.org
Many algorithms have been parallelized successfully on the Intel Xeon Phi coprocessor,
especially those with regular, balanced, and predictable data access patterns and …

Parallel Monte Carlo tree search from multi-core to many-core processors

SA Mirsoleimani, A Plaat… - 2015 IEEE Trustcom …, 2015 - ieeexplore.ieee.org
In recent years there has been much interest in the MCTS algorithm, a new, adaptive,
randomized optimization algorithm. In fields as diverse as Artificial Intelligence, Operations …

Exploration of cyber-physical systems for GPGPU computer vision-based detection of biological viruses

P Libuschewski - 2017 - eldorado.tu-dortmund.de
This work presents a method for a computer vision-based detection of biological viruses in
PAMONO sensor images and, related to this, methods to explore cyber-physical systems …

Performance analysis of a 240 thread tournament level MCTS Go program on the Intel Xeon Phi

SA Mirsoleimani, A Plaat, J Vermaseren… - arXiv preprint arXiv …, 2014 - arxiv.org
In 2013 Intel introduced the Xeon Phi, a new parallel co-processor board. The Xeon Phi is a
cache-coherent many-core shared memory architecture claiming CPU-like versatility …

Multi-objective, energy-aware GPGPU design space exploration for medical or industrial applications

P Libuschewski, P Marwedel… - … Conference on Signal …, 2014 - ieeexplore.ieee.org
This work presents a multi-objective design space exploration for Graphics Processing Units
(GPUs). For any given GPGPU application, a Pareto front of best suited GPUs can be …

Optimum power-performance GPU configuration prediction based on code attributes

A Jooya, N Dimopoulos… - … Conference on High …, 2017 - ieeexplore.ieee.org
GPUs have been widely used in the past decade to speed up the execution of general
purpose applications with high level of parallelism. The efficiency of running general …

Multiversion parallel synthesis of digital structures based on SystemC specification

V Obrizan, T Soklakova - 2016 IEEE East-West Design & Test …, 2016 - ieeexplore.ieee.org
This paper presents a multiversion parallel synthesis of digital structures based on SystemC
specification. The purpose of which is a substantial reduction in design time computing …

Structured parallel programming for Monte Carlo tree search

SA Mirsoleimani, A Plaat, J Herik… - arXiv preprint arXiv …, 2017 - arxiv.org
In this paper, we present a new algorithm for parallel Monte Carlo tree search (MCTS). It is
based on the pipeline pattern and allows flexible management of the control flow of the …