A hardware–software blueprint for flexible deep learning specialization

C Bobda, JM Mbongue, P Chow, M Ewais… - ACM Transactions on …, 2022 - dl.acm.org

In this article, we survey existing academic and commercial efforts to provide Field-
Programmable Gate Array (FPGA) acceleration in datacenters and the cloud. The goal is a …

Cited by 129 | Related articles | All 6 versions

[PDF] mdpi.com

A survey on RISC-V-based machine learning ecosystem

S Kalapothas, M Galetakis, G Flamis, F Plessas… - Information, 2023 - mdpi.com

In recent years, the advancements in specialized hardware architectures have supported the
industry and the research community to address the computation power needed for more …

Cited by 32 | Related articles | All 5 versions

[PDF] usenix.org

Ansor: Generating High-Performance tensor programs for deep learning

L Zheng, C Jia, M Sun, Z Wu, CH Yu, A Haj-Ali… - … USENIX symposium on …, 2020 - usenix.org

High-performance tensor programs are crucial to guarantee efficient execution of deep
neural networks. However, obtaining performant tensor programs for different operators on …

Cited by 428 | Related articles | All 16 versions

[PDF] arxiv.org

Mix and match: A novel FPGA-centric deep neural network quantization framework

SE Chang, Y Li, M Sun, R Shi, HKH So… - … Symposium on High …, 2021 - ieeexplore.ieee.org

Deep Neural Networks (DNNs) have achieved extraordinary performance in various
application domains. To support diverse DNN models, efficient implementations of DNN …

Cited by 117 | Related articles | All 7 versions

[PDF] arxiv.org

A TinyML platform for on-device continual learning with quantized latent replays

L Ravaglia, M Rusci, D Nadalini… - IEEE Journal on …, 2021 - ieeexplore.ieee.org

In the last few years, research and development on Deep Learning models & techniques for
ultra-low-power devices–in a word, TinyML–has mainly focused on a train-then-deploy …

Cited by 86 | Related articles | All 9 versions

[PDF] ieee.org

Hardware acceleration of sparse and irregular tensor computations of ML models: A survey and insights

S Dave, R Baghdadi, T Nowatzki… - Proceedings of the …, 2021 - ieeexplore.ieee.org

Machine learning (ML) models are widely used in many important domains. For efficiently
processing these computational- and memory-intensive applications, tensors of these …

Cited by 101 | Related articles | All 7 versions

[PDF] arxiv.org

DNNExplorer: a framework for modeling and exploring a novel paradigm of FPGA-based DNN accelerator

X Zhang, H Ye, J Wang, Y Lin, J Xiong, W Hwu… - Proceedings of the 39th …, 2020 - dl.acm.org

Existing FPGA-based DNN accelerators typically fall into two design paradigms. Either they
adopt a generic reusable architecture to support different DNN networks but leave some …

Cited by 96 | Related articles | All 7 versions

[PDF] arxiv.org

HASCO: Towards agile hardware and software co-design for tensor computation

Q Xiao, S Zheng, B Wu, P Xu, X Qian… - 2021 ACM/IEEE 48th …, 2021 - ieeexplore.ieee.org

Tensor computations overwhelm traditional general-purpose computing devices due to the
large amounts of data and operations of the computations. They call for a holistic solution …

Cited by 75 | Related articles | All 10 versions

[PDF] nsf.gov

Remote power attacks on the versatile tensor accelerator in multi-tenant FPGAs

S Tian, S Moini, A Wolnikowski… - 2021 IEEE 29th …, 2021 - ieeexplore.ieee.org

Architectural details of machine learning models are crucial pieces of intellectual property in
many applications. Revealing the structure or types of layers in a model can result in a leak …

Cited by 51 | Related articles | All 6 versions

[PDF] arxiv.org

Pure tensor program rewriting via access patterns (representation pearl)

GH Smith, A Liu, S Lyubomirsky, S Davidson… - Proceedings of the 5th …, 2021 - dl.acm.org

Tensor kernels in machine learning (ML) often correspond to pure mathematical
expressions, making term rewriting an attractive strategy for optimization and mapping to …

Cited by 40 | Related articles | All 4 versions
