A survey on hardware accelerators: Taxonomy, trends, challenges, and perspectives

B Peccerillo, M Mannino, A Mondelli… - Journal of Systems …, 2022 - Elsevier
In recent years, the limits of the multicore approach emerged in the so-called “dark silicon”
issue and diminishing returns of an ever-increasing core count. Hardware manufacturers …

An overview of processing-in-memory circuits for artificial intelligence and machine learning

D Kim, C Yu, S Xie, Y Chen, JY Kim… - IEEE Journal on …, 2022 - ieeexplore.ieee.org
Artificial intelligence (AI) and machine learning (ML) are revolutionizing many fields of study,
such as visual recognition, natural language processing, autonomous vehicles, and …

A modern primer on processing in memory

O Mutlu, S Ghose, J Gómez-Luna… - … computing: from devices …, 2022 - Springer
Modern computing systems are overwhelmingly designed to move data to computation. This
design choice goes directly against at least three key trends in computing that cause …

Benchmarking a new paradigm: Experimental analysis and characterization of a real processing-in-memory system

J Gómez-Luna, I El Hajj, I Fernandez… - IEEE …, 2022 - ieeexplore.ieee.org
Many modern workloads, such as neural networks, databases, and graph processing, are
fundamentally memory-bound. For such workloads, the data movement between main …

Breaking the von Neumann bottleneck: architecture-level processing-in-memory technology

X Zou, S Xu, X Chen, L Yan, Y Han - Science China Information Sciences, 2021 - Springer
The “memory wall” problem or so-called von Neumann bottleneck limits the efficiency of
conventional computer architectures, which move data from memory to CPU for …

SIMDRAM: A framework for bit-serial SIMD processing using DRAM

N Hajinazar, GF Oliveira, S Gregorio… - Proceedings of the 26th …, 2021 - dl.acm.org
Processing-using-DRAM has been proposed for a limited set of basic operations (i.e., logic
operations, addition). However, in order to enable full adoption of processing-using-DRAM …

GenASM: A high-performance, low-power approximate string matching acceleration framework for genome sequence analysis

DS Cali, GS Kalsi, Z Bingöl, C Firtina… - 2020 53rd Annual …, 2020 - ieeexplore.ieee.org
Genome sequence analysis has enabled significant advancements in medical and scientific
areas such as personalized medicine, outbreak tracing, and the understanding of evolution …

Processing-in-memory: A workload-driven perspective

S Ghose, A Boroumand, JS Kim… - IBM Journal of …, 2019 - ieeexplore.ieee.org
Many modern and emerging applications must process increasingly large volumes of data.
Unfortunately, prevalent computing paradigms are not designed to efficiently handle such …

Google neural network models for edge devices: Analyzing and mitigating machine learning inference bottlenecks

A Boroumand, S Ghose, B Akin… - 2021 30th …, 2021 - ieeexplore.ieee.org
Emerging edge computing platforms often contain machine learning (ML) accelerators that
can accelerate inference for a wide range of neural network (NN) models. These models are …

DAMOV: A new methodology and benchmark suite for evaluating data movement bottlenecks

GF Oliveira, J Gómez-Luna, L Orosa, S Ghose… - IEEE …, 2021 - ieeexplore.ieee.org
Data movement between the CPU and main memory is a first-order obstacle against improving
performance, scalability, and energy efficiency in modern systems. Computer systems …