An overview of architecture-level power- and energy-efficient design techniques

I Ratković, N Bežanić, OS Ünsal, A Cristal… - Advances in …, 2015 - Elsevier
Power dissipation and energy consumption became the primary design constraint for almost
all computer systems in the last 15 years. Both computer architects and circuit designers …

Efficient execution of memory access phases using dataflow specialization

CH Ho, SJ Kim, K Sankaralingam - Proceedings of the 42nd annual …, 2015 - dl.acm.org
This paper identifies a new opportunity for improving the efficiency of a processor core:
memory access phases of programs. These are dynamic regions of programs where most of …

Frequent loop detection using efficient non-intrusive on-chip hardware

A Gordon-Ross, F Vahid - … of the 2003 international conference on …, 2003 - dl.acm.org
Dynamic software optimization methods are becoming increasingly popular for improving
software performance and power. The first step in dynamic optimization consists of detecting …

Improving the utilization of micro-operation caches in x86 processors

JB Kotra, J Kalamatianos - 2020 53rd Annual IEEE/ACM …, 2020 - ieeexplore.ieee.org
Most modern processors employ variable length, Complex Instruction Set Computing (CISC)
instructions to reduce instruction fetch energy cost and bandwidth requirements. High …

Exploiting fixed programs in embedded systems: A loop cache example

A Gordon-Ross, S Cotterell… - IEEE Computer …, 2002 - ieeexplore.ieee.org
Embedded systems commonly execute one program for their lifetime. Designing embedded
system architectures with configurable components, such that those components can be …

Compiler-guided leakage optimization for banked scratch-pad memories

M Kandemir, MJ Irwin, G Chen… - IEEE Transactions on …, 2005 - ieeexplore.ieee.org
Current trends indicate that leakage energy consumption will be an important concern in
upcoming process technologies. In this paper, we propose a compiler-based leakage …

CASINO core microarchitecture: Generating out-of-order schedules using cascaded in-order scheduling windows

I Jeong, S Park, C Lee, WW Ro - 2020 IEEE International …, 2020 - ieeexplore.ieee.org
The performance gap between in-order (InO) and out-of-order (OoO) cores comes from the
ability to dynamically create highly optimized instruction issue schedules. In this work, we …

Compiler managed dynamic instruction placement in a low-power code cache

RA Ravindran, PD Nagarkar, GS Dasika… - … symposium on Code …, 2005 - ieeexplore.ieee.org
Modern embedded microprocessors use low power on-chip memories called scratch-pad
memories to store frequently executed instructions and data. Unlike traditional caches …

Improving the energy efficiency of big cores

K Czechowski, VW Lee, E Grochowski… - ACM SIGARCH …, 2014 - dl.acm.org
Traditionally, architectural innovations designed to boost single-threaded performance incur
overhead costs which significantly increase power consumption. In many cases the increase …

Dynamic scratch-pad memory management for irregular array access patterns

G Chen, O Ozturk, M Kandemir… - Proceedings of the …, 2006 - ieeexplore.ieee.org
There exist many embedded applications such as those executing on set-top boxes,
wireless base stations, HDTV, and mobile handsets that are structured as nested loops and …