E Berg, H Zeffer, E Hagersten - 2006 IEEE International …, 2006 - ieeexplore.ieee.org
The introduction of general-purpose microprocessors running multiple threads will put a focus on methods and tools helping a programmer to write efficient parallel applications …
Refactoring for data locality opens a new avenue for performance-oriented program rewriting. SLO has broken down a large part of the complexity that software developers face …
K Beyls, EH D'Hollander - … , HPCC 2006, Munich, Germany, September 13 …, 2006 - Springer
Due to the huge speed gaps in the memory hierarchy of modern computer architectures, it is important that programs maintain a good data locality. Improving temporal locality implies …
Information systems can be visualized with many tools. Typically these tools present functional artifacts from various phases of the development life-cycle; these include …
In SMT processors, the complex interplay between private and shared datapath resources needs to be considered in order to realize the full performance potential. In this paper, we …
MY Qadri, KD McDonald-Maier - EURASIP Journal on Embedded Systems, 2009 - Springer
Most modern 16-bit and 32-bit embedded processors contain cache memories to further increase instruction throughput of the device. Embedded processors that contain …
The growing speed gap between memory and processor makes an efficient use of the cache ever more important to reach high performance. One of the most important ways to improve …
M Kulkarni, V Pai, D Schuff - ACM SIGMETRICS Performance Evaluation …, 2011 - dl.acm.org
The prevalence of multicore architectures has made the performance analysis of multithreaded applications an intriguing area of inquiry. An understanding of locality effects …
C Fang, S Carr, S Önder, Z Wang - International Conference on Compiler …, 2006 - Springer
Profiling can effectively analyze program behavior and provide critical information for feedback-directed or dynamic optimizations. Based on memory profiling, reuse distance …