Enabling the adoption of processing-in-memory: Challenges, mechanisms, future research directions

S Ghose, K Hsieh, A Boroumand… - arXiv preprint arXiv …, 2018 - arxiv.org
Poor DRAM technology scaling over the course of many years has caused DRAM-based
main memory to increasingly become a larger system bottleneck. A major reason for the …

The processing-in-memory paradigm: Mechanisms to enable adoption

S Ghose, K Hsieh, A Boroumand… - … -CMOS Technologies for …, 2019 - Springer
Performance improvements from DRAM technology scaling have been lagging behind the
improvements from logic technology scaling for many years. As application demand for main …

Hardware architecture and software stack for PIM based on commercial DRAM technology: Industrial product

S Lee, S Kang, J Lee, H Kim, E Lee… - 2021 ACM/IEEE 48th …, 2021 - ieeexplore.ieee.org
Emerging applications such as deep neural networks demand high off-chip memory
bandwidth. However, under stringent physical constraints of chip packages and system …

Design of processing-“inside”-memory optimized for DRAM behaviors

WJ Lee, CH Kim, Y Paik, J Park, I Park, SW Kim - IEEE Access, 2019 - ieeexplore.ieee.org
The computing domain of today's computer systems is moving very fast from arithmetic to
data processing as data volumes grow exponentially. As a result, processing-in-memory …

A modern primer on processing in memory

O Mutlu, S Ghose, J Gómez-Luna… - … computing: from devices …, 2022 - Springer
Modern computing systems are overwhelmingly designed to move data to computation. This
design choice goes directly against at least three key trends in computing that cause …

ROC: DRAM-based processing with reduced operation cycles

X Xin, Y Zhang, J Yang - Proceedings of the 56th Annual Design …, 2019 - dl.acm.org
DRAM based memory-centric computing architectures are promising solutions to tackle the
challenges of memory wall. In this paper, we develop a novel design of DRAM-based …

Enabling practical processing in and near memory for data-intensive computing

O Mutlu, S Ghose, J Gómez-Luna… - Proceedings of the 56th …, 2019 - dl.acm.org
Modern computing systems suffer from the dichotomy between computation on one side,
which is performed only in the processor (and accelerators), and data storage/movement on …

Reducing DRAM latency at low cost by exploiting heterogeneity

D Lee - arXiv preprint arXiv:1604.08041, 2016 - arxiv.org
In modern systems, DRAM-based main memory is significantly slower than the processor.
Consequently, processors spend a long time waiting to access data from main memory …

Processing data where it makes sense: Enabling in-memory computation

O Mutlu, S Ghose, J Gómez-Luna… - Microprocessors and …, 2019 - Elsevier
Today's systems are overwhelmingly designed to move data to computation. This design
choice goes directly against at least three key trends in systems that cause performance …

Ambit: In-memory accelerator for bulk bitwise operations using commodity DRAM technology

V Seshadri, D Lee, T Mullins, H Hassan… - Proceedings of the 50th …, 2017 - dl.acm.org
Many important applications trigger bulk bitwise operations, i.e., bitwise operations on large
bit vectors. In fact, recent works design techniques that exploit fast bulk bitwise operations to …