Survey and benchmarking of machine learning accelerators

A Reuther, P Michaleas, M Jones… - 2019 IEEE high …, 2019 - ieeexplore.ieee.org
Advances in multicore processors and accelerators have opened the flood gates to greater
exploration and application of machine learning techniques to a variety of applications …

The deep learning compiler: A comprehensive survey

M Li, Y Liu, X Liu, Q Sun, X You, H Yang… - … on Parallel and …, 2020 - ieeexplore.ieee.org
The difficulty of deploying various deep learning (DL) models on diverse DL hardware has
boosted the research and development of DL compilers in the community. Several DL …

Pruning and quantization for deep neural network acceleration: A survey

T Liang, J Glossner, L Wang, S Shi, X Zhang - Neurocomputing, 2021 - Elsevier
Deep neural networks have been applied in many applications exhibiting extraordinary
abilities in the field of computer vision. However, complex network architectures challenge …

PUMA: A programmable ultra-efficient memristor-based accelerator for machine learning inference

A Ankit, IE Hajj, SR Chalamalasetti, G Ndu… - Proceedings of the …, 2019 - dl.acm.org
Memristor crossbars are circuits capable of performing analog matrix-vector multiplications,
overcoming the fundamental energy efficiency limitations of digital logic. They have been …

Hardware implementation of deep network accelerators towards healthcare and biomedical applications

MR Azghadi, C Lammie, JK Eshraghian… - … Circuits and Systems, 2020 - ieeexplore.ieee.org
The advent of dedicated Deep Learning (DL) accelerators and neuromorphic processors
has brought on new opportunities for applying both Deep and Spiking Neural Network …

hls4ml: An open-source codesign workflow to empower scientific low-power machine learning devices

F Fahim, B Hawks, C Herwig, J Hirschauer… - arXiv preprint arXiv …, 2021 - arxiv.org
Accessible machine learning algorithms, software, and diagnostic tools for energy-efficient
devices and systems are extremely valuable across a broad range of application domains …

Neural architecture search survey: A hardware perspective

KT Chitty-Venkata, AK Somani - ACM Computing Surveys, 2022 - dl.acm.org
We review the problem of automating hardware-aware architectural design process of Deep
Neural Networks (DNNs). The field of Convolutional Neural Network (CNN) algorithm design …

Automatic heterogeneous quantization of deep neural networks for low-latency inference on the edge for particle detectors

CN Coelho, A Kuusela, S Li, H Zhuang… - Nature Machine …, 2021 - nature.com
Although the quest for more accurate solutions is pushing deep learning research towards
larger and more complex algorithms, edge devices demand efficient inference and therefore …

Deep neural network approximation for custom hardware: Where we've been, where we're going

E Wang, JJ Davis, R Zhao, HC Ng, X Niu… - ACM Computing …, 2019 - dl.acm.org
Deep neural networks have proven to be particularly effective in visual and audio
recognition tasks. Existing models tend to be computationally expensive and memory …

Fast convolutional neural networks on FPGAs with hls4ml

T Aarrestad, V Loncar, N Ghielmetti… - Machine Learning …, 2021 - iopscience.iop.org
We introduce an automated tool for deploying ultra low-latency, low-power deep neural
networks with convolutional layers on field-programmable gate arrays (FPGAs). By …