Hardware based loop optimization for CGRA architectures

C Sunny, S Das, KJM Martin, P Coussy - International Symposium on …, 2021 - Springer
With the increasing demand for high performance computing in application domains with
stringent power budgets, coarse-grained reconfigurable array (CGRA) architectures have …

A comparative study of multiprocessor list scheduling heuristics

G Liao, ER Altman, VK Agarwal… - 1994 Proceedings of the …, 1994 - ieeexplore.ieee.org
Many multiprocessor list scheduling heuristics that account for interprocessor
communication delay have been proposed in recent years. However, no uniform …

Loop overhead reduction techniques for coarse grained reconfigurable architectures

K Vadivel, M Wijtvliet, R Jordans… - … Conference on Digital …, 2017 - ieeexplore.ieee.org
Due to their flexibility and high performance, Coarse Grained Reconfigurable Array (CGRA)
are a topic of increasing research interest. However, CGRAs also have the potential to …

High level power and energy exploration using ArchC

T Gupta, C Bertolini, O Héron… - 2010 22nd …, 2010 - ieeexplore.ieee.org
With the increase in the design complexity of MPSoC architectures, estimating power
consumption is very complex and time consuming at lower level of abstraction. We propose …

Replication Control for Distributed Real-Time Database Systems.

SH Son, S Kouloumbis - ICDCS, 1992 - researchgate.net
Schedulers for real-time distributed replicated databases must satisfy two requirements:
transactions should meet their timing constraints, and mutual consistency of replicated data …

Energy efficient hardware loop based optimization for CGRAs

C Sunny, S Das, KJM Martin, P Coussy - Journal of Signal Processing …, 2022 - Springer
Research interest and industry investment in edge computing solutions have increased
dramatically in recent years. Consequent quest for balanced performance, energy efficiency …

Efficient multimedia coprocessor with enhanced SIMD engines for exploiting ILP and DLP

L Huang, N Xiao, Z Wang, Y Wang, M Lai - Parallel Computing, 2013 - Elsevier
Multimedia applications have become increasingly important in daily computing. These
applications are composed of heterogeneous regions of code mixed with data-level …

Derivation of efficient FSM from loop nests

T Yuki, A Morvan, S Derrien - 2013 International Conference on …, 2013 - ieeexplore.ieee.org
Pipelined execution is one of the most important optimizations in hardware design to
improve hardware utilization rate, and hence the throughput. Loop pipelining is a …

[Book] Multicore Technology: Architecture, Reconfiguration, and Modeling

MY Qadri, SJ Sangwine - 2018 - books.google.com
The saturation of design complexity and clock frequencies for single-core processors has
resulted in the emergence of multicore architectures as an alternative design paradigm …

Towards a Parameterizable cycle-accurate ISS in ArchC

C Bechara, N Ventroux… - ACS/IEEE International …, 2010 - ieeexplore.ieee.org
With the increase in the design complexity of MP-SoC architectures, flexible and accurate
processor simulators became a necessity for exploring the vast design space solutions. In …