An Open-Source ML-Based Full-Stack Optimization Framework for Machine Learning Accelerators

H Esmaeilzadeh, S Ghodrati, A Kahng, JK Kim… - ACM Transactions on …, 2024 - dl.acm.org
Parameterizable machine learning (ML) accelerators are the product of recent
breakthroughs in ML. To fully enable their design space exploration (DSE), we propose a …

Physically accurate learning-based performance prediction of hardware-accelerated ml algorithms

H Esmaeilzadeh, S Ghodrati, AB Kahng… - Proceedings of the …, 2022 - dl.acm.org
Parameterizable ML accelerators are the product of recent breakthroughs in machine
learning (ML). To fully enable the design space exploration, we propose a physical-design …

ARCO: Adaptive Multi-Agent Reinforcement Learning-Based Hardware/Software Co-Optimization Compiler for Improved Performance in DNN Accelerator Design

A Fayyazi, M Kamal, M Pedram - arXiv preprint arXiv:2407.08192, 2024 - arxiv.org
This paper presents ARCO, an adaptive Multi-Agent Reinforcement Learning (MARL)-based
co-optimizing compilation framework designed to enhance the efficiency of mapping …

Reusing GEMM hardware for efficient execution of depthwise separable convolution on ASIC-based DNN accelerators

SD Manasi, S Banerjee, A Davare, AA Sorokin… - Proceedings of the 28th …, 2023 - dl.acm.org
Deep learning (DL) accelerators are optimized for standard convolution. However,
lightweight convolutional neural networks (CNNs) use depthwise convolution (DwC) in key …
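GEMM-based accelerators execute standard convolution efficiently because every output channel accumulates over all input channels, yielding large, dense matrix multiplications; depthwise convolution (DwC) filters each channel independently, so far fewer multiply-accumulates are available to fill the same array. A minimal sketch (illustrative only, not code from the paper) of that arithmetic gap:

```python
# Illustrative MAC counts for standard vs. depthwise convolution on one
# layer, showing why a GEMM array sized for standard convolution is
# underutilized by depthwise convolution.
def conv_macs(h, w, k, c_in, c_out):
    # Standard convolution: every output channel sees every input channel.
    return h * w * k * k * c_in * c_out

def depthwise_macs(h, w, k, c):
    # Depthwise convolution: each channel is filtered independently.
    return h * w * k * k * c

h = w = 56; k = 3; c = 128
std = conv_macs(h, w, k, c, c)
dw = depthwise_macs(h, w, k, c)
print(f"standard: {std} MACs, depthwise: {dw} MACs ({std // dw}x fewer)")
```

The ratio equals the output-channel count, which is exactly the reduction dimension a GEMM engine would otherwise exploit.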

Performance Analysis of DNN Inference/Training with Convolution and non-Convolution Operations

H Esmaeilzadeh, S Ghodrati, AB Kahng… - arXiv preprint arXiv …, 2023 - arxiv.org
Today's performance analysis frameworks for deep learning accelerators suffer from two
significant limitations. First, although modern convolutional neural networks (CNNs) consist of …

Performance analysis of CNN inference/training with convolution and non-convolution operations on ASIC accelerators

H Esmaeilzadeh, S Ghodrati, AB Kahng… - ACM Transactions on …, 2024 - dl.acm.org
Today's performance analysis frameworks for deep learning accelerators suffer from two
significant limitations. First, although modern convolutional neural networks (CNNs) consist …

DNN Model Theft Through Trojan Side-Channel on Edge FPGA Accelerator

S Chandrasekar, SK Lam, S Thambipillai - International Symposium on …, 2023 - Springer
In this paper, we present a novel hardware trojan assisted side-channel attack to reverse
engineer DNN architectures on edge FPGA accelerators. In particular, our attack targets the …

Optimization of the Versatile Tensor Accelerator (VTA) Load Module in a Time-Triggered Memory Access

AM Ezekiel, D Onwuchekwa… - 2023 26th Euromicro …, 2023 - ieeexplore.ieee.org
Embedded systems powered by artificial intelligence (AI) are widely employed in diverse
domains. However, the lack of inherent predictability in existing AI accelerators poses …

Exploration for Efficient Depthwise Separable Convolution Networks Deployment on FPGA

Z Huang, A Qie, C Zhang, J Yang… - 2024 IEEE 6th …, 2024 - ieeexplore.ieee.org
Depthwise Separable Convolution (DSC) has become the key structure in lightweight
convolutional neural networks. However, the tight connection between network structure and …
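Depthwise Separable Convolution factors a standard convolution into a per-channel depthwise stage followed by a 1x1 pointwise stage, which is where the parameter savings of lightweight CNNs come from. A short illustrative sketch (assumed layer sizes, not taken from the paper):

```python
# Parameter counts showing the compression behind depthwise separable
# convolution (DSC), the structure this paper targets for FPGA deployment.
def standard_conv_params(k, c_in, c_out):
    # One k x k filter per (input channel, output channel) pair.
    return k * k * c_in * c_out

def dsc_params(k, c_in, c_out):
    depthwise = k * k * c_in   # one k x k filter per input channel
    pointwise = c_in * c_out   # 1x1 convolution mixes channels
    return depthwise + pointwise

k, c_in, c_out = 3, 256, 256
print(standard_conv_params(k, c_in, c_out))  # 589824
print(dsc_params(k, c_in, c_out))            # 67840
```

The tight coupling between this factored structure and the hardware datapath is what makes the deployment exploration in the paper non-trivial.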

Software-driven Design for Domain-specific Compute

DA Kirkpatrick - Proceedings of the 2023 International Symposium on …, 2023 - dl.acm.org
The end of Dennard scaling has created a focus on advancing domain-specific computing;
we are seeing a renaissance of accelerating compute problems through specialization, with …