A review of near-memory computing architectures: Opportunities and challenges

G Singh, L Chelini, S Corda, AJ Awan… - 2018 21st Euromicro …, 2018 - ieeexplore.ieee.org
The conventional approach of moving stored data to the CPU for computation has become a
major performance bottleneck for emerging scale-out data-intensive applications due to their …

Breaking the von Neumann bottleneck: architecture-level processing-in-memory technology

X Zou, S Xu, X Chen, L Yan, Y Han - Science China Information Sciences, 2021 - Springer
The “memory wall” problem or so-called von Neumann bottleneck limits the efficiency of
conventional computer architectures, which move data from memory to CPU for …

DAMOV: A new methodology and benchmark suite for evaluating data movement bottlenecks

GF Oliveira, J Gómez-Luna, L Orosa, S Ghose… - IEEE …, 2021 - ieeexplore.ieee.org
Data movement between the CPU and main memory is a first-order obstacle against
improving performance, scalability, and energy efficiency in modern systems. Computer systems …

Near-memory computing: Past, present, and future

G Singh, L Chelini, S Corda, AJ Awan, S Stuijk… - Microprocessors and …, 2019 - Elsevier
The conventional approach of moving data to the CPU for computation has become a
significant performance bottleneck for emerging scale-out data-intensive applications due to …

NAPEL: Near-memory computing application performance prediction via ensemble learning

G Singh, J Gómez-Luna, G Mariani… - Proceedings of the 56th …, 2019 - dl.acm.org
The cost of moving data between the memory/storage units and the compute units is a major
contributor to the execution time and energy consumption of modern workloads in …

Neurostream: Scalable and energy efficient deep learning with smart memory cubes

E Azarkhish, D Rossi, I Loi… - IEEE Transactions on …, 2017 - ieeexplore.ieee.org
High-performance computing systems are moving towards 2.5D and 3D memory
hierarchies, based on High Bandwidth Memory (HBM) and Hybrid Memory Cube (HMC) to …

Concurrent data structures for near-memory computing

Z Liu, I Calciu, M Herlihy, O Mutlu - … of the 29th ACM Symposium on …, 2017 - dl.acm.org
The performance gap between memory and CPU has grown exponentially. To bridge this
gap, hardware architects have proposed near-memory computing (also called processing-in …

A scalable near-memory architecture for training deep neural networks on large in-memory datasets

F Schuiki, M Schaffner, FK Gürkaynak… - IEEE Transactions on …, 2018 - ieeexplore.ieee.org
Most investigations into near-memory hardware accelerators for deep neural networks have
primarily focused on inference, while the potential of accelerating training has received …

CAIRO: A compiler-assisted technique for enabling instruction-level offloading of processing-in-memory

R Hadidi, L Nai, H Kim, H Kim - ACM Transactions on Architecture and …, 2017 - dl.acm.org
Three-dimensional (3D)-stacking technology and the memory-wall problem have
popularized processing-in-memory (PIM) concepts again, which offers the benefits of …

PIMSim: A flexible and detailed processing-in-memory simulator

S Xu, X Chen, Y Wang, Y Han… - IEEE Computer …, 2018 - ieeexplore.ieee.org
With the advent of big data applications and new process technologies, Process-in-Memory
(PIM) attracts much attention in memory research as the architecture studies gradually shift …