Title | Authors | Venue | Cited by | Year |
StarPU: A Unified Platform for Task Scheduling on Heterogeneous Multicore Architectures | C Augonnet, S Thibault, R Namyst, PA Wacrenier | Euro-Par 2009 Parallel Processing: 15th International Euro-Par Conference …, 2009 | 1993 | 2009 |
hwloc: A generic framework for managing hardware affinities in HPC applications | F Broquedis, J Clet-Ortega, S Moreaud, N Furmento, B Goglin, G Mercier, ... | 2010 18th Euromicro Conference on Parallel, Distributed and Network-based …, 2010 | 609 | 2010 |
A hybridization methodology for high-performance linear algebra software for GPUs | E Agullo, C Augonnet, J Dongarra, H Ltaief, R Namyst, S Thibault, ... | GPU Computing Gems Jade Edition, 473-484, 2012 | 162 | 2012 |
QR factorization on a multicore node enhanced with multiple GPU accelerators | E Agullo, C Augonnet, J Dongarra, M Faverge, H Ltaief, S Thibault, ... | 2011 IEEE International Parallel & Distributed Processing Symposium, 932-943, 2011 | 145 | 2011 |
Data-aware task scheduling on multi-accelerator based platforms | C Augonnet, J Clet-Ortega, S Thibault, R Namyst | 2010 IEEE 16th International Conference on Parallel and Distributed Systems …, 2010 | 123 | 2010 |
Achieving high performance on supercomputers with a sequential task-based programming model | E Agullo, O Aumage, M Faverge, N Furmento, F Pruvost, M Sergent, ... | IEEE Transactions on Parallel and Distributed Systems, 2017 | 116 | 2017 |
StarPU-MPI: Task programming over clusters of machines enhanced with accelerators | C Augonnet, O Aumage, N Furmento, R Namyst, S Thibault | Recent Advances in the Message Passing Interface: 19th European MPI Users …, 2012 | 104 | 2012 |
StarPU: a runtime system for scheduling tasks over accelerator-based multicore machines | C Augonnet, S Thibault, R Namyst | INRIA, 2010 | 98 | 2010 |
Automatic calibration of performance models on heterogeneous multicore architectures | C Augonnet, S Thibault, R Namyst | Euro-Par 2009–Parallel Processing Workshops: HPPC, HeteroPar, PROPER, ROIA …, 2010 | 85 | 2010 |
Structuring the execution of OpenMP applications for multicore architectures | F Broquedis, O Aumage, B Goglin, S Thibault, PA Wacrenier, R Namyst | 2010 IEEE International Symposium on Parallel & Distributed Processing …, 2010 | 84 | 2010 |
Evaluation of OpenMP dependent tasks with the KASTORS benchmark suite | P Virouleau, P Brunet, F Broquedis, N Furmento, S Thibault, O Aumage, ... | Using and Improving OpenMP for Devices, Tasks, and More: 10th International …, 2014 | 74 | 2014 |
Taking advantage of hybrid systems for sparse direct solvers via task-based runtimes | X Lacoste, M Faverge, G Bosilca, P Ramet, S Thibault | 2014 IEEE International Parallel & Distributed Processing Symposium …, 2014 | 72 | 2014 |
Programmability and performance portability aspects of heterogeneous multi-/manycore systems | C Kessler, U Dastgeer, S Thibault, R Namyst, A Richards, U Dolinsky, ... | 2012 Design, Automation & Test in Europe Conference & Exhibition (DATE …, 2012 | 59 | 2012 |
Faithful performance prediction of a dynamic task‐based runtime system for heterogeneous multi‐core architectures | L Stanisic, S Thibault, A Legrand, B Videau, JF Méhaut | Concurrency and Computation: Practice and Experience 27 (16), 4075-4090, 2015 | 52 | 2015 |
Building portable thread schedulers for hierarchical multiprocessors: The BubbleSched framework | S Thibault, R Namyst, PA Wacrenier | Euro-Par 2007 Parallel Processing, 42-51, 2007 | 49 | 2007 |
Scheduling dynamic OpenMP applications over multicore architectures | F Broquedis, F Diakhaté, S Thibault, O Aumage, R Namyst, PA Wacrenier | OpenMP in a New Era of Parallelism: 4th International Workshop, IWOMP 2008 …, 2008 | 43 | 2008 |
Improving performance by embedding HPC applications in lightweight Xen domains | S Thibault, T Deegan | Proceedings of the 2nd workshop on System-level virtualization for high …, 2008 | 39 | 2008 |
Controlling the memory subscription of distributed applications with a task-based runtime system | M Sergent, D Goudin, S Thibault, O Aumage | 2016 IEEE International Parallel and Distributed Processing Symposium …, 2016 | 36 | 2016 |
A visual performance analysis framework for task‐based parallel applications running on hybrid clusters | V Garcia Pinto, L Mello Schnorr, L Stanisic, A Legrand, S Thibault, ... | Concurrency and Computation: Practice and Experience 30 (18), e4472, 2018 | 34 | 2018 |
Flexible runtime support for efficient skeleton programming on heterogeneous GPU-based systems | U Dastgeer, C Kessler, S Thibault | Applications, Tools and Techniques on the Road to Exascale Computing, 159-166, 2012 | 31 | 2012 |