CRISP: critical slice prefetching

H Litz, G Ayers, P Ranganathan - Proceedings of the 27th ACM …, 2022 - dl.acm.org
The high access latency of DRAM continues to be a performance challenge for
contemporary microprocessor systems. Prefetching is a well-established technique to …

The forward slice core microarchitecture

K Lakshminarasimhan, A Naithani, J Feliu… - Proceedings of the …, 2020 - dl.acm.org
Superscalar out-of-order cores deliver high performance at the cost of increased complexity
and power budget. In-order cores, in contrast, are less complex and have a smaller power …

Delay and bypass: Ready and criticality aware instruction scheduling in out-of-order processors

M Alipour, S Kaxiras, D Black-Schaffer… - … Symposium on High …, 2020 - ieeexplore.ieee.org
Flexible instruction scheduling is essential for performance in out-of-order processors. This
is typically achieved by using CAM-based Instruction Queues (IQs) that provide complete …

XPro: A cross-end processing architecture for data analytics in wearables

A Wang, L Chen, W Xu - ACM SIGARCH Computer Architecture News, 2017 - dl.acm.org
Wearable computing systems have spurred many opportunities to continuously monitor
human bodies with sensors worn on or implanted in the body. These emerging platforms …

STRAIGHT: Hazardless processor architecture without register renaming

H Irie, T Koizumi, A Fukuda, S Akaki… - 2018 51st Annual …, 2018 - ieeexplore.ieee.org
The single-thread performance of a processor improves the capability of the entire system by
reducing the critical path latency of programs. Typically, conventional superscalar …

CASINO core microarchitecture: Generating out-of-order schedules using cascaded in-order scheduling windows

I Jeong, S Park, C Lee, WW Ro - 2020 IEEE International …, 2020 - ieeexplore.ieee.org
The performance gap between in-order (InO) and out-of-order (OoO) cores comes from the
ability to dynamically create highly optimized instruction issue schedules. In this work, we …

A case for a more effective, power-efficient turbo boosting

S Kondguli, M Huang - ACM Transactions on Architecture and Code …, 2018 - dl.acm.org
Single-thread performance and throughput often pose different design constraints and
require compromises. Mainstream CPUs today incorporate a non-trivial number of cores …

Clockhands: Rename-free Instruction Set Architecture for Out-of-order Processors

T Koizumi, R Shioya, S Sugita, T Amano… - Proceedings of the 56th …, 2023 - dl.acm.org
Out-of-order superscalar processors are currently the only architecture that speeds up
irregular programs, but they suffer from poor power efficiency. To tackle this issue, we …

Reconstructing Out-of-Order Issue Queue

I Jeong, J Lee, MK Yoon, WW Ro - 2022 55th IEEE/ACM …, 2022 - ieeexplore.ieee.org
Out-of-order cores provide high performance at the cost of energy efficiency. Dynamic
scheduling is one of the major contributors to this: generating highly optimized issue …

Efficiently scaling out-of-order cores for simultaneous multithreading

FM Sleiman, TF Wenisch - ACM SIGARCH Computer Architecture News, 2016 - dl.acm.org
Simultaneous multithreading (SMT) out-of-order cores waste a significant portion of
structural out-of-order core resources on instructions that do not need them. These …