A survey of accelerator architectures for deep neural networks

Y Chen, Y Xie, L Song, F Chen, T Tang - Engineering, 2020 - Elsevier
Recently, due to the availability of big data and the rapid growth of computing power,
artificial intelligence (AI) has regained tremendous attention and investment. Machine …

[HTML][HTML] A survey on hardware accelerators: Taxonomy, trends, challenges, and perspectives

B Peccerillo, M Mannino, A Mondelli… - Journal of Systems …, 2022 - Elsevier
In recent years, the limits of the multicore approach emerged in the so-called “dark silicon”
issue and diminishing returns of an ever-increasing core count. Hardware manufacturers …

A modern primer on processing in memory

O Mutlu, S Ghose, J Gómez-Luna… - … computing: from devices …, 2022 - Springer
Modern computing systems are overwhelmingly designed to move data to computation. This
design choice goes directly against at least three key trends in computing that cause …

Ambit: In-memory accelerator for bulk bitwise operations using commodity DRAM technology

V Seshadri, D Lee, T Mullins, H Hassan… - Proceedings of the 50th …, 2017 - dl.acm.org
Many important applications trigger bulk bitwise operations, ie, bitwise operations on large
bit vectors. In fact, recent works design techniques that exploit fast bulk bitwise operations to …

Pipelayer: A pipelined reram-based accelerator for deep learning

L Song, X Qian, H Li, Y Chen - 2017 IEEE international …, 2017 - ieeexplore.ieee.org
Convolution neural networks (CNNs) are the heart of deep learning applications. Recent
works PRIME [1] and ISAAC [2] demonstrated the promise of using resistive random access …

Benchmarking a new paradigm: Experimental analysis and characterization of a real processing-in-memory system

J Gómez-Luna, I El Hajj, I Fernandez… - IEEE …, 2022 - ieeexplore.ieee.org
Many modern workloads, such as neural networks, databases, and graph processing, are
fundamentally memory-bound. For such workloads, the data movement between main …

Drisa: A dram-based reconfigurable in-situ accelerator

S Li, D Niu, KT Malladi, H Zheng, B Brennan… - Proceedings of the 50th …, 2017 - dl.acm.org
Data movement between the processing units and the memory in traditional von Neumann
architecture is creating the" memory wall" problem. To bridge the gap, two approaches, the …

Breaking the von Neumann bottleneck: architecture-level processing-in-memory technology

X Zou, S Xu, X Chen, L Yan, Y Han - Science China Information Sciences, 2021 - Springer
The “memory wall” problem or so-called von Neumann bottleneck limits the efficiency of
conventional computer architectures, which move data from memory to CPU for …

SIMDRAM: A framework for bit-serial SIMD processing using DRAM

N Hajinazar, GF Oliveira, S Gregorio… - Proceedings of the 26th …, 2021 - dl.acm.org
Processing-using-DRAM has been proposed for a limited set of basic operations (ie, logic
operations, addition). However, in order to enable full adoption of processing-using-DRAM …

Processing data where it makes sense: Enabling in-memory computation

O Mutlu, S Ghose, J Gómez-Luna… - Microprocessors and …, 2019 - Elsevier
Today's systems are overwhelmingly designed to move data to computation. This design
choice goes directly against at least three key trends in systems that cause performance …