RAELLA: Reforming the arithmetic for efficient, low-resolution, and low-loss analog PIM: No retraining required!

T Andrulis, JS Emer, V Sze - … of the 50th Annual International Symposium …, 2023 - dl.acm.org
Processing-In-Memory (PIM) accelerators have the potential to efficiently run Deep Neural
Network (DNN) inference by reducing costly data movement and by using resistive RAM …

Towards efficient sparse matrix vector multiplication on real processing-in-memory architectures

C Giannoula, I Fernandez, J Gómez-Luna… - ACM SIGMETRICS …, 2022 - dl.acm.org
Several manufacturers have already started to commercialize near-bank Processing-In-Memory (PIM) architectures, after decades of research efforts. Near-bank PIM architectures …

Ant: Exploiting adaptive numerical data type for low-bit deep neural network quantization

C Guo, C Zhang, J Leng, Z Liu, F Yang… - 2022 55th IEEE/ACM …, 2022 - ieeexplore.ieee.org
Quantization is a technique to reduce the computation and memory cost of DNN models,
which are getting increasingly large. Existing quantization solutions use fixed-point integer …

Sparse attention acceleration with synergistic in-memory pruning and on-chip recomputation

A Yazdanbakhsh, A Moradifirouzabadi… - 2022 55th IEEE/ACM …, 2022 - ieeexplore.ieee.org
As its core computation, a self-attention mechanism gauges pairwise correlations across the
entire input sequence. Despite favorable performance, calculating pairwise correlations is …

On the accuracy of analog neural network inference accelerators

TP Xiao, B Feinberg, CH Bennett… - IEEE Circuits and …, 2022 - ieeexplore.ieee.org
Specialized accelerators have recently garnered attention as a method to reduce the power
consumption of neural network inference. A promising category of accelerators utilizes …

Survey of Deep Learning Accelerators for Edge and Emerging Computing

S Alam, C Yakopcic, Q Wu, M Barnell, S Khan… - Electronics, 2024 - mdpi.com
The unprecedented progress in artificial intelligence (AI), particularly in deep learning
algorithms with ubiquitous internet connected smart devices, has created a high demand for …

Genpip: In-memory acceleration of genome analysis via tight integration of basecalling and read mapping

H Mao, M Alser, M Sadrosadati, C Firtina… - 2022 55th IEEE/ACM …, 2022 - ieeexplore.ieee.org
Nanopore sequencing is a widely-used high-throughput genome sequencing technology
that can sequence long fragments of a genome into raw electrical signals at low cost …

Tandem processor: Grappling with emerging operators in neural networks

S Ghodrati, S Kinzer, H Xu, R Mahapatra… - Proceedings of the 29th …, 2024 - dl.acm.org
With the ever increasing prevalence of neural networks and the upheaval from the language
models, it is time to rethink neural acceleration. Up to this point, the broader research …

Inca: Input-stationary dataflow at outside-the-box thinking about deep learning accelerators

B Kim, S Li, H Li - 2023 IEEE International Symposium on High …, 2023 - ieeexplore.ieee.org
This paper first presents an input-stationary (IS) implemented crossbar accelerator (INCA),
supporting inference and training for deep neural networks (DNNs). Processing-in-memory …

Era-bs: Boosting the efficiency of reram-based pim accelerator with fine-grained bit-level sparsity

F Liu, W Zhao, Z Wang, Y Chen… - IEEE Transactions on …, 2023 - ieeexplore.ieee.org
Resistive Random-Access-Memory (ReRAM) crossbar is one of the most promising neural
network accelerators, thanks to its in-memory and in-situ analog computing abilities for …