The future of FPGA acceleration in datacenters and the cloud

C Bobda, JM Mbongue, P Chow, M Ewais… - ACM Transactions on …, 2022 - dl.acm.org
In this article, we survey existing academic and commercial efforts to provide Field-
Programmable Gate Array (FPGA) acceleration in datacenters and the cloud. The goal is a …

A survey on RISC-V-based machine learning ecosystem

S Kalapothas, M Galetakis, G Flamis, F Plessas… - Information, 2023 - mdpi.com
In recent years, the advancements in specialized hardware architectures have supported the
industry and the research community to address the computation power needed for more …

Ansor: Generating High-Performance tensor programs for deep learning

L Zheng, C Jia, M Sun, Z Wu, CH Yu, A Haj-Ali… - … USENIX symposium on …, 2020 - usenix.org
High-performance tensor programs are crucial to guarantee efficient execution of deep
neural networks. However, obtaining performant tensor programs for different operators on …

Mix and match: A novel FPGA-centric deep neural network quantization framework

SE Chang, Y Li, M Sun, R Shi, HKH So… - … Symposium on High …, 2021 - ieeexplore.ieee.org
Deep Neural Networks (DNNs) have achieved extraordinary performance in various
application domains. To support diverse DNN models, efficient implementations of DNN …

A TinyML platform for on-device continual learning with quantized latent replays

L Ravaglia, M Rusci, D Nadalini… - IEEE Journal on …, 2021 - ieeexplore.ieee.org
In the last few years, research and development on Deep Learning models & techniques for
ultra-low-power devices–in a word, TinyML–has mainly focused on a train-then-deploy …

Hardware acceleration of sparse and irregular tensor computations of ML models: A survey and insights

S Dave, R Baghdadi, T Nowatzki… - Proceedings of the …, 2021 - ieeexplore.ieee.org
Machine learning (ML) models are widely used in many important domains. For efficiently
processing these computational- and memory-intensive applications, tensors of these …

DNNExplorer: a framework for modeling and exploring a novel paradigm of FPGA-based DNN accelerator

X Zhang, H Ye, J Wang, Y Lin, J Xiong, W Hwu… - Proceedings of the 39th …, 2020 - dl.acm.org
Existing FPGA-based DNN accelerators typically fall into two design paradigms. Either they
adopt a generic reusable architecture to support different DNN networks but leave some …

HASCO: Towards agile hardware and software co-design for tensor computation

Q Xiao, S Zheng, B Wu, P Xu, X Qian… - 2021 ACM/IEEE 48th …, 2021 - ieeexplore.ieee.org
Tensor computations overwhelm traditional general-purpose computing devices due to the
large amounts of data and operations of the computations. They call for a holistic solution …

Remote power attacks on the versatile tensor accelerator in multi-tenant FPGAs

S Tian, S Moini, A Wolnikowski… - 2021 IEEE 29th …, 2021 - ieeexplore.ieee.org
Architectural details of machine learning models are crucial pieces of intellectual property in
many applications. Revealing the structure or types of layers in a model can result in a leak …

Pure tensor program rewriting via access patterns (representation pearl)

GH Smith, A Liu, S Lyubomirsky, S Davidson… - Proceedings of the 5th …, 2021 - dl.acm.org
Tensor kernels in machine learning (ML) often correspond to pure mathematical
expressions, making term rewriting an attractive strategy for optimization and mapping to …