A Survey of Design and Optimization for Systolic Array-based DNN Accelerators

R Xu, S Ma, Y Guo, D Li - ACM Computing Surveys, 2023 - dl.acm.org
In recent years, the systolic array has proven to be a successful architecture for
DNN hardware accelerators. However, the design of systolic arrays has also encountered many …
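To ground the entry above, here is a minimal cycle-level sketch of the systolic-array idea: an output-stationary grid of processing elements in which rows of A and columns of B stream through with a one-cycle diagonal skew. This is a generic illustration of the architecture, not the design of any particular surveyed accelerator.

```python
def systolic_matmul(A, B):
    """Output-stationary systolic array sketch computing C = A @ B.

    PE (i, j) holds accumulator C[i][j].  Row i of A enters from the
    left and column j of B from the top, each skewed by one cycle per
    row/column, so operands A[i][s] and B[s][j] meet at PE (i, j) at
    cycle t = s + i + j."""
    n, k, m = len(A), len(A[0]), len(B[0])
    C = [[0] * m for _ in range(n)]
    for t in range(n + m + k - 2):          # cycles until the array drains
        for i in range(n):
            for j in range(m):
                s = t - i - j               # operand pair arriving this cycle
                if 0 <= s < k:
                    C[i][j] += A[i][s] * B[s][j]
    return C
```

Each PE performs at most one multiply-accumulate per cycle; the skew guarantees that matching operands arrive simultaneously, which is the property that makes the structure attractive for DNN matrix workloads.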

Sparseloop: An analytical approach to sparse tensor accelerator modeling

YN Wu, PA Tsai, A Parashar, V Sze… - 2022 55th IEEE/ACM …, 2022 - ieeexplore.ieee.org
In recent years, many accelerators have been proposed to efficiently process sparse tensor
algebra applications (e.g., sparse neural networks). However, these proposals are single …

Flexagon: A multi-dataflow sparse-sparse matrix multiplication accelerator for efficient DNN processing

F Muñoz-Martínez, R Garg, M Pellauer… - Proceedings of the 28th …, 2023 - dl.acm.org
Sparsity is a growing trend in modern DNN models. Existing Sparse-Sparse Matrix
Multiplication (SpMSpM) accelerators are tailored to a particular SpMSpM dataflow (i.e., Inner …
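For context, the inner-product dataflow named above can be sketched in software: each output element is formed by intersecting the sorted coordinate lists of a sparse row of A and a sparse column of B. This is a generic illustration of that dataflow, not Flexagon's hardware design; the list-of-(index, value)-pairs format stands in for CSR/CSC storage.

```python
def spmspm_inner_product(A_rows, B_cols):
    """Inner-product SpMSpM sketch: C[i][j] is the dot product of
    sparse row i of A and sparse column j of B, computed by merging
    their sorted (index, value) lists.  Only nonzero outputs are kept."""
    C = {}
    for i, row in enumerate(A_rows):
        for j, col in enumerate(B_cols):
            acc, p, q = 0, 0, 0
            while p < len(row) and q < len(col):
                ka, va = row[p]
                kb, vb = col[q]
                if ka == kb:                # indices match: multiply-accumulate
                    acc += va * vb
                    p += 1
                    q += 1
                elif ka < kb:               # advance the lagging list
                    p += 1
                else:
                    q += 1
            if acc:
                C[(i, j)] = acc
    return C
```

The intersection step is cheap when both operands are very sparse, but every (row, column) pair is visited, which is exactly the kind of dataflow trade-off such accelerators are specialized around.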

TeAAL: A declarative framework for modeling sparse tensor accelerators

N Nayak, TO Odemuyiwa, S Ugare, C Fletcher… - Proceedings of the 56th …, 2023 - dl.acm.org
Over the past few years, the explosion in sparse tensor algebra workloads has led to a
corresponding rise in domain-specific accelerators to service them. Due to the irregularity …

AppCiP: Energy-efficient approximate convolution-in-pixel scheme for neural network acceleration

S Tabrizchi, A Nezhadi, S Angizi… - IEEE Journal on …, 2023 - ieeexplore.ieee.org
Nowadays, always-on, intelligent, and self-powered visual perception systems have gained
considerable attention and are widely used. However, capturing data and analyzing it via a …

Sparse-DySta: Sparsity-Aware Dynamic and Static Scheduling for Sparse Multi-DNN Workloads

H Fan, SI Venieris, A Kouris, N Lane - … of the 56th Annual IEEE/ACM …, 2023 - dl.acm.org
Running multiple deep neural networks (DNNs) in parallel has become an emerging
workload on both edge devices, such as mobile phones, where multiple tasks serve a single …

MR-PIPA: An Integrated Multilevel RRAM (HfOx)-Based Processing-In-Pixel Accelerator

M Abedin, A Roohi, M Liehr, N Cady… - IEEE Journal on …, 2022 - ieeexplore.ieee.org
This work paves the way to realize a processing-in-pixel (PIP) accelerator based on a
multilevel HfOx resistive random access memory (RRAM) as a flexible, energy-efficient, and …

TaskFusion: An efficient transfer learning architecture with dual delta sparsity for multi-task natural language processing

Z Fan, Q Zhang, P Abillama, S Shoouri, C Lee… - Proceedings of the 50th …, 2023 - dl.acm.org
The combination of pre-trained models and task-specific fine-tuning schemes, such as
BERT, has achieved great success in various natural language processing (NLP) tasks …
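The delta-sparsity idea named in the title can be illustrated generically: a task-specific weight matrix is the shared pre-trained matrix plus a sparse per-task delta, so the dense shared product is computed once and each task adds only a cheap sparse correction. This sketch is an illustrative assumption about that general scheme, not TaskFusion's architecture; `delta_sparse_infer` and its (row, col, value) delta format are hypothetical.

```python
import numpy as np

def delta_sparse_infer(x, W_base, deltas):
    """Multi-task inference sketch with sparse weight deltas.

    Each task's weights are W_base + dW, where dW is sparse, so
    x @ W_base is computed once and shared across all tasks.
    deltas: {task: [(row, col, value), ...]} sparse updates."""
    shared = x @ W_base                      # dense product, computed once
    outputs = {}
    for task, dW in deltas.items():
        y = shared.copy()
        for r, c, v in dW:                   # cheap sparse correction
            y[:, c] += x[:, r] * v
        outputs[task] = y
    return outputs
```

The savings grow with the number of tasks, since the dense part of the work is amortized while each per-task cost scales only with the number of nonzero delta entries.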

SECDA: Efficient hardware/software co-design of FPGA-based DNN accelerators for edge inference

J Haris, P Gibson, J Cano, NB Agostini… - 2021 IEEE 33rd …, 2021 - ieeexplore.ieee.org
Edge computing devices inherently face tight resource constraints, which is especially
apparent when deploying Deep Neural Networks (DNNs) with high memory and compute …

Zero and narrow-width value-aware compression for quantized convolutional neural networks

M Jang, J Kim, H Nam, S Kim - IEEE Transactions on …, 2023 - ieeexplore.ieee.org
Convolutional neural networks (CNNs) are commonly used in systems with dedicated neural
processing units for CNN-related computations. For high performance and low hardware …
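The zero and narrow-width value idea in this title can be shown with a toy encoder: zero values cost only a tag, quantized values that fit in 4 bits are stored narrow, and the rest keep their full 8 bits. The tag scheme and bit widths here are illustrative assumptions, not the paper's actual compression format.

```python
def compress(values):
    """Toy zero / narrow-width compression for 8-bit quantized data.

    Each value gets a tag: 'Z' (zero, no payload), 'N' (narrow,
    4-bit payload), or 'F' (full, 8-bit payload)."""
    tags, payload = [], []
    for v in values:
        assert 0 <= v < 256
        if v == 0:
            tags.append('Z')              # zero: tag only
        elif v < 16:
            tags.append('N')              # fits in 4 bits
            payload.append((v, 4))
        else:
            tags.append('F')              # needs the full 8 bits
            payload.append((v, 8))
    return tags, payload

def compressed_bits(tags, payload, tag_bits=2):
    """Total size in bits: one fixed-width tag per value plus payloads."""
    return len(tags) * tag_bits + sum(width for _, width in payload)
```

Because quantized CNN tensors are dominated by zeros and small magnitudes, even a crude scheme like this beats storing every value at full width, which is the effect such compression hardware exploits.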