Compiler-driven reconfiguration of multiprocessors

M Hußmann, M Thies, U Kastens, M Purnaprajna, M Porrmann, U Rückert
Proceedings of the Workshop on Application Specific …, 2007 - fg-kastens.cs.uni-paderborn.de
Abstract
Multiprocessors enable parallel execution of a single large application to achieve a performance improvement. An application is split at instruction, data or task level (based on the granularity), such that the overhead of partitioning is minimal. Parallelization for multiprocessors is mostly restricted to a fixed granularity. Reconfiguration enables architectural variations to allow multiple granularities of operation within a multiprocessor. This adaptability optimizes resource utilization over a fixed organization. Here, a unified hardware-software approach to design a reconfigurable multiprocessor system called QuadroCore is presented. In our holistic methodology, compiler-driven reconfiguration selects from a fixed set of modes. Each mode relies on matching program analysis to exploit the architecture efficiently. For instance, a multiprocessor may adapt to different parallelization paradigms. The compiler can determine the best execution mode for each piece of code by analyzing the parallelism in a program. A fast, single-cycle, run-time reconfiguration between these predetermined modes is enabled by executing special instructions which switch coarse-grained components like instruction decoders, ALUs and register banks. Performance is evaluated in terms of execution cycles and achieved clock frequency. First results indicate suitability especially in audio and video processing applications.
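
The mode selection described in the abstract can be illustrated with a minimal sketch. It is not the QuadroCore compiler: the mode names, the `Region` type, the per-mode speedup estimates and the `("switch", mode)` marker are all hypothetical stand-ins, chosen only to show the idea of picking one mode per piece of code from a fixed set and emitting a switch instruction where the mode changes.

```python
from dataclasses import dataclass
from enum import Enum


class Mode(Enum):
    # Hypothetical mode names, one per granularity named in the abstract.
    INSTRUCTION = "instruction"
    DATA = "data"
    TASK = "task"


@dataclass
class Region:
    name: str
    # Hypothetical analysis result: estimated speedup per mode.
    est_speedup: dict


def select_mode(region: Region) -> Mode:
    """Pick the mode with the highest estimated speedup for this region."""
    # Modes without an estimate default to 1.0 (assumed: no speedup).
    return max(Mode, key=lambda m: region.est_speedup.get(m, 1.0))


def schedule(regions):
    """Return a linear plan, inserting a switch only when the mode changes."""
    plan, current = [], None
    for region in regions:
        mode = select_mode(region)
        if mode is not current:
            # Stands in for the special reconfiguration instruction.
            plan.append(("switch", mode))
            current = mode
        plan.append(("run", region.name))
    return plan
```

Calling `schedule` on a list of regions yields a plan in which consecutive regions that prefer the same mode share a single switch, which reflects the assumption that reconfiguration, although fast, should only be issued when the mode actually changes.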