Leveraging the VTA-TVM hardware-software stack for FPGA acceleration of 8-bit ResNet-18 inference

AMC Deiana, N Tran, J Agar, M Blott… - Frontiers in big …, 2022 - frontiersin.org

In this community review report, we discuss applications and techniques for fast machine
learning (ML) in science—the concept of integrating powerful ML methods into the real-time …

Cited by 66 · Related articles · All 27 versions

Vision-based autonomous bolt-looseness detection method for splice connections: Design, lab-scale evaluation, and field application

TC Huynh - Automation in Construction, 2021 - Elsevier

This study presents a novel autonomous vision-based bolt-looseness detection method for
splice bolted connections. The method is sequentially designed with a Faster regional …

Cited by 83 · Related articles

AHA: An agile approach to the design of coarse-grained reconfigurable accelerators and compilers

K Koul, J Melchert, K Sreedhar, L Truong… - ACM Transactions on …, 2023 - dl.acm.org

With the slowing of Moore's law, computer architects have turned to domain-specific
hardware specialization to continue improving the performance and efficiency of computing …

Cited by 22 · Related articles · All 3 versions

Marvel: A data-centric approach for mapping deep learning operators on spatial accelerators

P Chatarasi, H Kwon, A Parashar, M Pellauer… - ACM Transactions on …, 2021 - dl.acm.org

A spatial accelerator's efficiency depends heavily on both its mapper and cost models to
generate optimized mappings for various operators of DNN models. However, existing cost …

Cited by 56 · Related articles · All 7 versions

Unified buffer: Compiling image processing and machine learning applications to push-memory accelerators

Q Liu, J Setter, D Huff, M Strange, K Feng… - ACM Transactions on …, 2023 - dl.acm.org

Image processing and machine learning applications benefit tremendously from hardware
acceleration. Existing compilers target either FPGAs, which sacrifice power and performance …

Cited by 14 · Related articles · All 3 versions

An Efficient Hybrid Deep Learning Accelerator for Compact and Heterogeneous CNNs

F Qararyah, MW Azhar, P Trancoso - ACM Transactions on Architecture …, 2024 - dl.acm.org

Resource-efficient Convolutional Neural Networks (CNNs) are gaining more attention.
These CNNs have relatively low computational and memory requirements. A common …

Cited by 3 · Related articles

Quantune: Post-training quantization of convolutional neural networks using extreme gradient boosting for fast deployment

J Lee, M Yu, Y Kwon, T Kim - Future Generation Computer Systems, 2022 - Elsevier

To adopt convolutional neural networks (CNN) for a range of resource-constrained targets, it
is necessary to compress the CNN models by performing quantization, whereby precision …

Cited by 20 · Related articles · All 4 versions

TensorFlow to cloud FPGAs: Tradeoffs for accelerating deep neural networks

S Hadjis, K Olukotun - 2019 29th International Conference on …, 2019 - ieeexplore.ieee.org

We present the first open-source TensorFlow to FPGA tool capable of running state-of-the-
art DNNs. Running TensorFlow on the Amazon cloud FPGA instances, we provide …

Cited by 28 · Related articles · All 5 versions

FiBHA: Fixed budget hybrid CNN accelerator

F Qararyah, MW Azhar… - 2022 IEEE 34th …, 2022 - ieeexplore.ieee.org

Seeking the “sweet spot” in the accuracy-efficiency trade-off is increasing the heterogeneity
of state-of-the-art Convolutional Neural Networks (CNNs). Such CNN models exhibit …

Cited by 7 · Related articles · All 3 versions

Transparent compiler and runtime specializations for accelerating managed languages on FPGAs

M Papadimitriou, J Fumero, A Stratikopoulos… - arXiv preprint arXiv …, 2020 - arxiv.org

In recent years, heterogeneous computing has emerged as the vital way to increase
computers' performance and energy efficiency by combining diverse hardware devices …

Cited by 10 · Related articles · All 5 versions
