Active memory cube: A processing-in-memory architecture for exascale systems R Nair, SF Antao, C Bertolli, P Bose, JR Brunheroto, T Chen, CY Cher, ... IBM Journal of Research and Development 59 (2/3), 17:1-17:14, 2015 | 229 | 2015 |
OP2: An active library framework for solving unstructured mesh-based applications on multi-core and many-core architectures GR Mudalige, MB Giles, I Reguly, C Bertolli, PHJ Kelly 2012 Innovative Parallel Computing (InPar), 1-12, 2012 | 104 | 2012 |
PyOP2: A high-level framework for performance-portable simulations on unstructured meshes F Rathgeber, GR Markall, L Mitchell, N Loriant, DA Ham, C Bertolli, ... 2012 SC Companion: High Performance Computing, Networking Storage and …, 2012 | 89 | 2012 |
Data access optimization in a processing-in-memory system Z Sura, A Jacob, T Chen, B Rosenburg, O Sallenave, C Bertolli, S Antao, ... Proceedings of the 12th ACM International Conference on Computing Frontiers, 1-8, 2015 | 86 | 2015 |
Offloading support for OpenMP in Clang and LLVM SF Antao, A Bataev, AC Jacob, GT Bercea, AE Eichenberger, G Rokos, ... 2016 Third Workshop on the LLVM Compiler Infrastructure in HPC (LLVM-HPC), 1-11, 2016 | 84 | 2016 |
Coordinating GPU threads for OpenMP 4.0 in LLVM C Bertolli, SF Antao, AE Eichenberger, K O'Brien, Z Sura, AC Jacob, T Chen, ... 2014 LLVM Compiler Infrastructure in HPC, 12-21, 2014 | 80 | 2014 |
Acceleration of a full-scale industrial CFD application with OP2 IZ Reguly, GR Mudalige, C Bertolli, MB Giles, A Betts, PHJ Kelly, ... IEEE Transactions on Parallel and Distributed Systems 27 (5), 1265-1278, 2015 | 68 | 2015 |
Integrating GPU support for OpenMP offloading directives into Clang C Bertolli, SF Antao, GT Bercea, AC Jacob, AE Eichenberger, T Chen, ... Proceedings of the Second Workshop on the LLVM Compiler Infrastructure in …, 2015 | 66 | 2015 |
Next generation grids and wireless communication networks: towards a novel integrated approach R Fantacci, M Vanneschi, C Bertolli, G Mencagli, D Tarchi Wireless Communications and Mobile Computing 9 (4), 445-467, 2009 | 47 | 2009 |
Designing OP2 for GPU architectures MB Giles, GR Mudalige, B Spencer, C Bertolli, I Reguly Journal of Parallel and Distributed Computing 73 (11), 1451-1460, 2013 | 41 | 2013 |
Accessing global data from accelerator devices C Bertolli, JK O'Brien, OH Sallenave, ZN Sura US Patent 9,513,828, 2016 | 36 | 2016 |
Performance analysis of OpenMP on a GPU using a CORAL proxy application GT Bercea, C Bertolli, SF Antao, AC Jacob, AE Eichenberger, T Chen, ... Proceedings of the 6th International Workshop on Performance Modeling …, 2015 | 36 | 2015 |
Design and performance of the OP2 library for unstructured mesh applications C Bertolli, A Betts, G Mudalige, M Giles, P Kelly European Conference on Parallel Processing, 191-200, 2011 | 33 | 2011 |
Design and initial performance of a high-level unstructured mesh framework on heterogeneous parallel systems GR Mudalige, MB Giles, J Thiyagalingam, IZ Reguly, C Bertolli, PHJ Kelly, ... Parallel Computing 39 (11), 669-692, 2013 | 31 | 2013 |
Performance-portable finite element assembly using PyOP2 and FEniCS GR Markall, F Rathgeber, L Mitchell, N Loriant, C Bertolli, DA Ham, ... Supercomputing: 28th International Supercomputing Conference, ISC 2013 …, 2013 | 30 | 2013 |
Loop chaining: A programming abstraction for balancing locality and parallelism CD Krieger, MM Strout, C Olschanowsky, A Stone, S Guzik, X Gao, ... 2013 IEEE International Symposium on Parallel & Distributed Processing …, 2013 | 29 | 2013 |
Performance analysis and optimization of Clang's OpenMP 4.5 GPU support M Martineau, S McIntosh-Smith, C Bertolli, AC Jacob, SF Antao, ... 2016 7th International Workshop on Performance Modeling, Benchmarking and …, 2016 | 28 | 2016 |
Early experiences porting three applications to OpenMP 4.5 I Karlin, T Scogland, AC Jacob, SF Antao, GT Bercea, C Bertolli, ... OpenMP: Memory, Devices, and Tasks: 12th International Workshop on OpenMP …, 2016 | 26 | 2016 |
Generalizing run-time tiling with the loop chain abstraction MM Strout, F Luporini, CD Krieger, C Bertolli, GT Bercea, C Olschanowsky, ... 2014 IEEE 28th International Parallel and Distributed Processing Symposium …, 2014 | 24 | 2014 |
Expressing adaptivity and context awareness in the ASSISTANT programming model C Bertolli, D Buono, G Mencagli, M Vanneschi Autonomic Computing and Communications Systems: Third International ICST …, 2010 | 24 | 2010 |