TOP-PIM: Throughput-oriented programmable processing in memory

D Zhang, N Jayasena, A Lyashevsky… - Proceedings of the 23rd …, 2014 - dl.acm.org
As computation becomes increasingly limited by data movement and energy consumption,
exploiting locality throughout the memory hierarchy becomes critical to continued …

Chameleon: Versatile and practical near-DRAM acceleration architecture for large memory systems

H Asghari-Moghaddam, YH Son… - 2016 49th annual …, 2016 - ieeexplore.ieee.org
The performance of computer systems is often limited by the bandwidth of their memory
channels, but further increasing the bandwidth is challenging under the stringent pin and …

Data-centric computing frontiers: A survey on processing-in-memory

P Siegl, R Buchty, M Berekovic - Proceedings of the Second …, 2016 - dl.acm.org
A major shift from compute-centric to data-centric computing systems can be perceived, as
novel big data workloads like cognitive computing and machine learning strongly enforce …

Demystifying CXL memory with genuine CXL-ready systems and devices

Y Sun, Y Yuan, Z Yu, R Kuper, C Song… - Proceedings of the 56th …, 2023 - dl.acm.org
The ever-growing demands for memory with larger capacity and higher bandwidth have
driven recent innovations on memory expansion and disaggregation technologies based on …

Exploring the performance benefit of hybrid memory system on HPC environments

IB Peng, R Gioiosa, G Kestor, P Cicotti… - 2017 IEEE …, 2017 - ieeexplore.ieee.org
Hardware accelerators have become a de-facto standard to achieve high performance on
current supercomputers and there are indications that this trend will increase in the future …

BATMAN: Techniques for maximizing system bandwidth of memory systems with stacked-DRAM

C Chou, A Jaleel, M Qureshi - Proceedings of the International …, 2017 - dl.acm.org
Tiered-memory systems consist of high-bandwidth 3D-DRAM and high-capacity commodity-DRAM.
Conventional designs attempt to improve system performance by maximizing the …

Farewell my shared LLC! A case for private die-stacked DRAM caches for servers

A Shahab, M Zhu, A Margaritov… - 2018 51st Annual IEEE …, 2018 - ieeexplore.ieee.org
The slowdown in technology scaling mandates rethinking of conventional CPU architectures
in a quest for higher performance and new capabilities. This work takes a step in this …

Characterizing the performance benefit of hybrid memory system for HPC applications

IB Peng, R Gioiosa, G Kestor, JS Vetter, P Cicotti… - Parallel Computing, 2018 - Elsevier
Heterogeneous memory systems that consist of multiple memory technologies are becoming
common in high-performance computing environments. Modern processors and …

Exploring the processing-in-memory design space

M Scrbak, M Islam, KM Kavi, M Ignatowski… - Journal of Systems …, 2017 - Elsevier
With the emergence of 3D-DRAM, Processing-in-Memory has once more become of great
interest to the research community and industry. Here we present our observations on a …

Genetic Cache: A Machine Learning Approach to Designing DRAM Cache Controllers in HBM Systems

M Amouzegar, M Rezaalipour… - ACM Journal on Emerging …, 2024 - dl.acm.org
DRAM memory controller plays a critical role in maximizing the performance of high
bandwidth memory by efficiently managing data transfers between the CPU and the memory …