Systematic energy characterization of CMP/SMT processor systems via automated micro-benchmarks

R Bertran, A Buyuktosunoglu, MS Gupta… - 2012 45th Annual …, 2012 - ieeexplore.ieee.org
Microprocessor-based systems today are composed of multi-core, multi-threaded
processors with complex cache hierarchies and gigabytes of main memory. Accurate …

Combining relations and text in scientific network clustering

D Combe, C Largeron… - 2012 IEEE/ACM …, 2012 - ieeexplore.ieee.org
In this paper, we present different combined clustering methods and we evaluate their
performances and their results on a dataset with ground truth. This dataset, built from several …

Synchronizing namespaces with invertible bloom filters

W Fu, HB Abraham, P Crowley - 2015 ACM/IEEE Symposium …, 2015 - ieeexplore.ieee.org
Data synchronization, long a staple in file systems, is emerging as a significant communications
primitive. In a distributed system, data synchronization resolves differences among …

Automatic generation of models of microarchitectures

A Abel - 2020 - publikationen.sulb.uni-saarland.de
Detailed microarchitectural models are necessary to predict, explain, or optimize the
performance of software running on modern microprocessors. Building such models often …

Automatic mapping of parallel applications on multicore architectures using the Servet benchmark suite

J González-Domínguez, GL Taboada… - Computers & Electrical …, 2012 - Elsevier
Servet is a suite of benchmarks focused on detecting a set of parameters with high influence
on the overall performance of multicore systems. These parameters can be used for …

Memory aware load balance strategy on a parallel branch‐and‐bound application

JMN Silva, C Boeres, LMA Drummond… - Concurrency and …, 2015 - Wiley Online Library
The latest trends in high performance computing systems show an increasing demand on
the use of a large scale multicore system in an efficient way so that high compute‐intensive …

UPCBLAS: a library for parallel matrix computations in Unified Parallel C

J González‐Domínguez, MJ Martín… - Concurrency and …, 2012 - Wiley Online Library
The popularity of Partitioned Global Address Space (PGAS) languages has
increased during the last years thanks to their high programmability and performance …

Effortless monitoring of arithmetic intensity with PAPI's counter analysis toolkit

D Barry, A Danalis, H Jagode - … Computing 2018/2019: Proceedings of the …, 2021 - Springer
With exascale computing forthcoming, performance metrics such as memory traffic and
arithmetic intensity are increasingly important for codes that heavily utilize numerical …

Performance analysis of complex shared memory systems

D Molka - 2017 - tu-dresden.de
The goal of this thesis is to improve the understanding of the achieved application
performance on existing hardware. It can be observed that the scaling of parallel …

Measurement of the latency parameters of the Multi-BSP model: a multicore benchmarking approach

A Savadi, H Deldari - The Journal of Supercomputing, 2014 - Springer
Computer benchmarking is a common method for measuring the parameters of a
computational model. It helps to measure the parameters of any computer. With the …