Bouquet of instruction pointers: Instruction pointer classifier-based spatial hardware prefetching

S Pakalapati, B Panda - 2020 ACM/IEEE 47th Annual …, 2020 - ieeexplore.ieee.org
Hardware prefetching is one of the common off-chip DRAM latency hiding techniques.
Though hardware prefetchers are ubiquitous in the commercial machines and prefetching …

CLIP: Load criticality based data prefetching for bandwidth-constrained many-core systems

B Panda - Proceedings of the 56th Annual IEEE/ACM …, 2023 - dl.acm.org
Hardware prefetching is a latency-hiding technique that hides the costly off-chip DRAM
accesses. However, state-of-the-art prefetchers fail to deliver performance improvement in …

DSPatch: Dual spatial pattern prefetcher

R Bera, AV Nori, O Mutlu, S Subramoney - … of the 52nd Annual IEEE/ACM …, 2019 - dl.acm.org
High main memory latency continues to limit performance of modern high-performance
out-of-order cores. While DRAM latency has remained nearly the same over many generations …

Berti: an accurate local-delta data prefetcher

A Navarro-Torres, B Panda… - 2022 55th IEEE/ACM …, 2022 - ieeexplore.ieee.org
Data prefetching is a technique that plays a crucial role in modern high-performance
processors by hiding long latency memory accesses. Several state-of-the-art hardware …

A prefetch control strategy based on improved hill-climbing method in asymmetric multi-core architecture

J Fang, Y Xu, H Kong, M Cai - The Journal of Supercomputing, 2023 - Springer
Cache prefetching is a traditional way to reduce memory access latency. In multi-core
systems, aggressive prefetching may harm the system. In the past, prefetching throttling …

Hyperion: A Highly Effective Page and PC Based Delta Prefetcher

Y Cui, W Chen, X Cheng, J Yi - ACM Transactions on Architecture and …, 2024 - dl.acm.org
Hardware prefetching plays an important role in modern processors for hiding memory
access latency. Delta prefetchers show great potential at the L1D cache level, as they can …

Bandwidth-aware dynamic prefetch configuration for IBM POWER8

C Navarro, J Feliu, S Petit, ME Gomez… - IEEE Transactions on …, 2020 - ieeexplore.ieee.org
Advanced hardware prefetch engines are being integrated in current high-performance
processors. Prefetching can boost the performance of most applications, however, the …

Multi-Strided Access Patterns to Boost Hardware Prefetching

MO Blom, KFD Rietveld, RV van Nieuwpoort - arXiv preprint arXiv …, 2024 - arxiv.org
Important memory-bound kernels, such as linear algebra, convolutions, and stencils, rely on
SIMD instructions as well as optimizations targeting improved vectorized data traversal and …

Novel Method for Verification and Performance Evaluation of a Non-Blocking Level-1 Instruction Cache designed for Out-of-Order RISC-V Superscaler Processor on …

V Desalphine, S Dashora, L Mali… - … Symposium on VLSI …, 2020 - ieeexplore.ieee.org
Performance of instruction cache has become an important factor in enhancing the overall
performance of a system. This paper describes a novel method to evaluate the performance …

CDPM: Context-directed pattern matching prefetching to improve coarse-grained reconfigurable array performance

L Liu, C Yang, S Yin, S Wei - IEEE Transactions on Computer …, 2017 - ieeexplore.ieee.org
Coarse-grained reconfigurable arrays (CGRAs) can be dynamically programmed by
configuration contexts to concurrently run multiple operations on a processing elements …