A survey of machine learning for computer architecture and systems

N Wu, Y Xie - ACM Computing Surveys (CSUR), 2022 - dl.acm.org
It has been a long time that computer architecture and systems are optimized for efficient
execution of machine learning (ML) models. Now, it is time to reconsider the relationship …

Pond: CXL-based memory pooling systems for cloud platforms

H Li, DS Berger, L Hsu, D Ernst, P Zardoshti… - Proceedings of the 28th …, 2023 - dl.acm.org
Public cloud providers seek to meet stringent performance requirements and low hardware
cost. A key driver of performance and cost is main memory. Memory pooling promises to …

Pythia: A customizable hardware prefetching framework using online reinforcement learning

R Bera, K Kanellopoulos, A Nori, T Shahroodi… - MICRO-54: 54th Annual …, 2021 - dl.acm.org
Past research has proposed numerous hardware prefetching techniques, most of which rely
on exploiting one specific type of program context information (e.g., program counter …

Decoupled vector runahead

A Naithani, J Roelandts, S Ainsworth… - Proceedings of the 56th …, 2023 - dl.acm.org
We present Decoupled Vector Runahead (DVR), an in-core prefetching technique,
executing separately to the main application thread, that exploits massive amounts of …

APT-GET: profile-guided timely software prefetching

S Jamilan, TA Khan, G Ayers, B Kasikci… - Proceedings of the …, 2022 - dl.acm.org
Prefetching, which predicts future memory accesses and preloads them from main memory,
is a widely-adopted technique to overcome the processor-memory performance gap …

Fine-grained address segmentation for attention-based variable-degree prefetching

P Zhang, A Srivastava, AV Nori, R Kannan… - Proceedings of the 19th …, 2022 - dl.acm.org
Machine learning algorithms have shown potential to improve prefetching performance by
accurately predicting future memory accesses. Existing approaches are based on the …

Cache in hand: Expander-driven CXL prefetcher for next generation CXL-SSD

M Kwon, S Lee, M Jung - Proceedings of the 15th ACM Workshop on …, 2023 - dl.acm.org
Integrating Compute Express Link (CXL) with SSDs allows scalable access to large memory
but has slower speeds than DRAMs. We present ExPAND, an expander-driven CXL …

Micro-armed bandit: lightweight & reusable reinforcement learning for microarchitecture decision-making

G Gerogiannis, J Torrellas - Proceedings of the 56th Annual IEEE/ACM …, 2023 - dl.acm.org
Online Reinforcement Learning (RL) has been adopted as an effective mechanism in
various decision-making problems in microarchitecture. Its high adaptability and the ability to …

HoPP: Hardware-Software Co-Designed Page Prefetching for Disaggregated Memory

H Li, K Liu, T Liang, Z Li, T Lu, H Yuan… - … Symposium on High …, 2023 - ieeexplore.ieee.org
Memory disaggregation is a promising direction to mitigate memory contention in
datacenters. To make memory disaggregation practical, prior efforts expose remote memory …

Berti: an accurate local-delta data prefetcher

A Navarro-Torres, B Panda… - 2022 55th IEEE/ACM …, 2022 - ieeexplore.ieee.org
Data prefetching is a technique that plays a crucial role in modern high-performance
processors by hiding long latency memory accesses. Several state-of-the-art hardware …