FPGA HLS today: successes, challenges, and opportunities

J Cong, J Lau, G Liu, S Neuendorffer, P Pan… - ACM Transactions on …, 2022 - dl.acm.org
The year 2011 marked an important transition for FPGA high-level synthesis (HLS), as it
went from prototyping to deployment. A decade later, in this article, we assess the progress …

Programming and synthesis for software-defined FPGA acceleration: status and future prospects

YH Lai, E Ustun, S Xiang, Z Fang, H Rong… - ACM Transactions on …, 2021 - dl.acm.org
FPGA-based accelerators are increasingly popular across a broad range of applications,
because they offer massive parallelism, high energy efficiency, and great flexibility for …

Gemmini: Enabling systematic deep-learning architecture evaluation via full-stack integration

H Genc, S Kim, A Amid, A Haj-Ali, V Iyer… - 2021 58th ACM/IEEE …, 2021 - ieeexplore.ieee.org
DNN accelerators are often developed and evaluated in isolation without considering the
cross-stack, system-level effects in real-world environments. This makes it difficult to …

ScaleHLS: A new scalable high-level synthesis framework on multi-level intermediate representation

H Ye, C Hao, J Cheng, H Jeong… - … symposium on high …, 2022 - ieeexplore.ieee.org
High-level synthesis (HLS) has been widely adopted as it significantly improves the
hardware design productivity and enables efficient design space exploration (DSE). Existing …

TensorIR: An abstraction for automatic tensorized program optimization

S Feng, B Hou, H Jin, W Lin, J Shao, R Lai… - Proceedings of the 28th …, 2023 - dl.acm.org
Deploying deep learning models on various devices has become an important topic. The
wave of hardware specialization brings a diverse set of acceleration primitives for multi …

HeteroCL: A multi-paradigm programming infrastructure for software-defined reconfigurable computing

YH Lai, Y Chi, Y Hu, J Wang, CH Yu, Y Zhou… - Proceedings of the …, 2019 - dl.acm.org
With the pursuit of improving compute performance under strict power constraints, there is
an increasing need for deploying applications to heterogeneous hardware architectures with …

Gemmini: An agile systolic array generator enabling systematic evaluations of deep-learning architectures

H Genc, A Haj-Ali, V Iyer, A Amid, H Mao… - arXiv preprint arXiv …, 2019 - alonamid.github.io
Advances in deep learning and neural networks have resulted in rapid development of
hardware accelerators that support them. A large majority of ASIC accelerators, however …

Optimizing deep neural networks on intelligent edge accelerators via flexible-rate filter pruning

G Li, X Ma, X Wang, H Yue, J Li, L Liu, X Feng… - Journal of Systems …, 2022 - Elsevier
While deep learning has shown superior performance in various intelligent tasks, it is still a
challenging problem to deploy sophisticated models on resource-limited edge devices. Filter …

Remote power attacks on the Versatile Tensor Accelerator in multi-tenant FPGAs

S Tian, S Moini, A Wolnikowski… - 2021 IEEE 29th …, 2021 - ieeexplore.ieee.org
Architectural details of machine learning models are crucial pieces of intellectual property in
many applications. Revealing the structure or types of layers in a model can result in a leak …

DNNVM: End-to-end compiler leveraging heterogeneous optimizations on FPGA-based CNN accelerators

Y Xing, S Liang, L Sui, X Jia, J Qiu, X Liu… - … on Computer-Aided …, 2019 - ieeexplore.ieee.org
The convolutional neural network (CNN) has become a state-of-the-art method for several
artificial intelligence domains in recent years. The increasingly complex CNN models are …