Compute-in-memory chips for deep learning: Recent trends and prospects

S Yu, H Jiang, S Huang, X Peng… - IEEE Circuits and Systems …, 2021 - ieeexplore.ieee.org
Compute-in-memory (CIM) is a new computing paradigm that addresses the memory-wall
problem in hardware accelerator design for deep learning. The input vector and weight …

Deep learning for geological hazards analysis: Data, models, applications, and opportunities

Z Ma, G Mei - Earth-Science Reviews, 2021 - Elsevier
As natural disasters are induced by geodynamic activities or abnormal changes in the
environment, geological hazards tend to wreak havoc on the environment and human …

Hardware architecture and software stack for PIM based on commercial DRAM technology: Industrial product

S Lee, S Kang, J Lee, H Kim, E Lee… - 2021 ACM/IEEE 48th …, 2021 - ieeexplore.ieee.org
Emerging applications such as deep neural networks demand high off-chip memory
bandwidth. However, under stringent physical constraints of chip packages and system …

Simba: Scaling deep-learning inference with multi-chip-module-based architecture

YS Shao, J Clemons, R Venkatesan, B Zimmer… - Proceedings of the …, 2019 - dl.acm.org
Package-level integration using multi-chip-modules (MCMs) is a promising approach for
building large-scale systems. Compared to a large monolithic die, an MCM combines many …

A modern primer on processing in memory

O Mutlu, S Ghose, J Gómez-Luna… - … computing: from devices …, 2022 - Springer
Modern computing systems are overwhelmingly designed to move data to computation. This
design choice goes directly against at least three key trends in computing that cause …

Efficient processing of deep neural networks: A tutorial and survey

V Sze, YH Chen, TJ Yang, JS Emer - Proceedings of the IEEE, 2017 - ieeexplore.ieee.org
Deep neural networks (DNNs) are currently widely used for many artificial intelligence (AI)
applications including computer vision, speech recognition, and robotics. While DNNs …

PUMA: A programmable ultra-efficient memristor-based accelerator for machine learning inference

A Ankit, IE Hajj, SR Chalamalasetti, G Ndu… - Proceedings of the …, 2019 - dl.acm.org
Memristor crossbars are circuits capable of performing analog matrix-vector multiplications,
overcoming the fundamental energy efficiency limitations of digital logic. They have been …
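The analog matrix-vector multiplication this snippet describes follows from Ohm's and Kirchhoff's laws: with weights stored as conductances G and inputs applied as word-line voltages V, each bit-line current is I_j = Σ_i V_i·G_ij, i.e., one column of a dot product. A minimal digital model of an ideal crossbar (illustrative sketch only; function and variable names are not from the paper):

```python
import numpy as np

# Model an ideal memristor crossbar: weights are stored as conductances,
# inputs are applied as word-line voltages, and outputs are read as
# bit-line currents. By Kirchhoff's current law, I[j] = sum_i V[i] * G[i, j],
# so the whole matrix-vector product happens in one analog "step".
def crossbar_mvm(conductances, voltages):
    return voltages @ conductances

rng = np.random.default_rng(0)
G = rng.uniform(0.0, 1.0, size=(4, 3))  # 4x3 crossbar of conductances
V = np.array([0.2, 0.5, 0.1, 0.9])      # input voltages on 4 word lines
I = crossbar_mvm(G, V)                  # currents on 3 bit lines
```

Real crossbars add non-idealities (wire resistance, device variation, ADC quantization) that this ideal model ignores.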

SCNN: An accelerator for compressed-sparse convolutional neural networks

A Parashar, M Rhu, A Mukkara, A Puglielli… - ACM SIGARCH …, 2017 - dl.acm.org
Convolutional Neural Networks (CNNs) have emerged as a fundamental technology for
machine learning. High performance and extreme energy efficiency are critical for …
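The payoff of a compressed-sparse representation is that multiplications by zero are never issued. A toy 1-D version of that idea, storing only nonzero weights as (offset, value) pairs (illustrative only; this is not SCNN's actual dataflow or format):

```python
import numpy as np

# Store only the nonzero weights of a kernel as (offset, value) pairs,
# then compute a valid-mode 1-D correlation touching nonzeros only.
def compress(weights):
    return [(i, w) for i, w in enumerate(weights) if w != 0.0]

def sparse_conv1d(x, sparse_w, klen):
    out = np.zeros(len(x) - klen + 1)
    for off, w in sparse_w:               # skip every zero weight
        out += w * x[off:off + len(out)]
    return out

x = np.array([1.0, 2.0, 0.0, 3.0, 1.0, 4.0])
w = np.array([0.5, 0.0, 0.0, -1.0])       # 75% of this kernel is zero
y = sparse_conv1d(x, compress(w), len(w))
```

Here only 2 of 4 weights generate work; real compressed-sparse accelerators exploit sparsity in both weights and activations.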

Beyond data and model parallelism for deep neural networks

Z Jia, M Zaharia, A Aiken - Proceedings of Machine Learning …, 2019 - proceedings.mlsys.org
Existing deep learning systems commonly parallelize deep neural network (DNN) training
using data or model parallelism, but these strategies often result in suboptimal …
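The two baseline strategies the snippet names partition a layer differently: data parallelism shards the batch and replicates the weights, while model parallelism shards the weights and replicates the input. A toy comparison for one dense layer Y = X @ W on two simulated "devices" (a hypothetical sketch; this is not the paper's search algorithm):

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.standard_normal((8, 6))   # batch of 8 examples, 6 features
W = rng.standard_normal((6, 4))   # layer weights

# Data parallelism: split the batch across devices; each holds a full W.
Y_data = np.concatenate([x_shard @ W for x_shard in np.split(X, 2, axis=0)],
                        axis=0)

# Model parallelism: split W by output columns; each device sees all of X.
Y_model = np.concatenate([X @ w_shard for w_shard in np.split(W, 2, axis=1)],
                         axis=1)
```

Both shardings reproduce the unpartitioned product; they differ in what must be communicated (gradients of W vs. activations), which is why per-operator hybrid strategies can beat either one.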

Timeloop: A systematic approach to DNN accelerator evaluation

A Parashar, P Raina, YS Shao, YH Chen… - … analysis of systems …, 2019 - ieeexplore.ieee.org
This paper presents Timeloop, an infrastructure for evaluating and exploring the architecture
design space of deep neural network (DNN) accelerators. Timeloop uses a concise and …
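Mapping-space exploration of the kind this entry describes amounts to enumerating legal loop-nest transformations and scoring each with a cost model. A minimal flavor of that search for a tiled matrix multiply, using a deliberately simple traffic model (hypothetical constants and cost model, not Timeloop's):

```python
import itertools

# Enumerate tile sizes for C[M,N] += A[M,K] * B[K,N] and pick the tiling
# that minimizes estimated off-chip traffic while fitting in the buffer.
M, N, K = 64, 64, 64
BUFFER = 2048  # on-chip buffer capacity, in words (assumed)

def traffic(tm, tn, tk):
    # Classic tiled-matmul reuse analysis: each A element is re-fetched
    # once per N-tile, each B element once per M-tile, C once per K-tile.
    return M * K * (N // tn) + K * N * (M // tm) + M * N * (K // tk)

def fits(tm, tn, tk):
    # One A tile, one B tile, and one C tile must fit on chip at once.
    return tm * tk + tk * tn + tm * tn <= BUFFER

tiles = [4, 8, 16, 32]
best = min((t for t in itertools.product(tiles, repeat=3) if fits(*t)),
           key=lambda t: traffic(*t))
```

Real mappers also search loop orders, spatial unrolling, and multi-level buffer hierarchies, which is what makes a systematic evaluation infrastructure necessary.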