AutoDNNchip: An automated DNN chip predictor and builder for both FPGAs and ASICs

P Xu, X Zhang, C Hao, Y Zhao, Y Zhang… - Proceedings of the …, 2020 - dl.acm.org
Recent breakthroughs in Deep Neural Networks (DNNs) have fueled a growing demand for
domain-specific hardware accelerators (i.e., DNN chips). However, designing DNN chips is …

Bambu: A modular framework for the high level synthesis of memory-intensive applications

C Pilato, F Ferrandi - 2013 23rd International conference on …, 2013 - ieeexplore.ieee.org
This paper presents bambu, a modular framework for research on high-level synthesis
currently under development at Politecnico di Milano. It can accept most C constructs …

HybridDNN: A framework for high-performance hybrid DNN accelerator design and implementation

H Ye, X Zhang, Z Huang, G Chen… - 2020 57th ACM/IEEE …, 2020 - ieeexplore.ieee.org
To speed up Deep Neural Networks (DNN) accelerator design and enable effective
implementation, we propose HybridDNN, a framework for building high-performance hybrid …

FCUDA: Enabling efficient compilation of CUDA kernels onto FPGAs

A Papakonstantinou, K Gururaj… - 2009 IEEE 7th …, 2009 - ieeexplore.ieee.org
As growing power dissipation and thermal effects disrupted the rising clock frequency trend
and threatened to annul Moore's law, the computing industry has switched its route to higher …

An efficient and versatile scheduling algorithm based on SDC formulation

J Cong, Z Zhang - Proceedings of the 43rd annual Design Automation …, 2006 - dl.acm.org
Scheduling plays a central role in the behavioral synthesis process, which automatically
compiles high-level specifications into optimized hardware implementations. However, most …

New solutions on LLM acceleration, optimization, and application

Y Huang, LJ Wan, H Ye, M Jha, J Wang, Y Li… - Proceedings of the 61st …, 2024 - dl.acm.org
Large Language Models (LLMs) have revolutionized a wide range of applications with their
strong human-like understanding and creativity. Due to the continuously growing model size …

Design space exploration of multiple loops on FPGAs using high level synthesis

G Zhong, V Venkataramani, Y Liang… - 2014 IEEE 32nd …, 2014 - ieeexplore.ieee.org
Real-world applications such as image processing, signal processing, and others often
contain a sequence of computation intensive kernels, each represented in the form of a …

Auto-NBA: Efficient and effective search over the joint space of networks, bitwidths, and accelerators

Y Fu, Y Zhang, Y Zhang, D Cox… - … Conference on Machine …, 2021 - proceedings.mlr.press
While maximizing deep neural networks' (DNNs') acceleration efficiency requires a joint
search/design of three different yet highly coupled aspects, including the networks, bitwidths …

An empirical evaluation of high-level synthesis languages and tools for database acceleration

O Arcas-Abella, G Ndu, N Sonmez… - … Conference on Field …, 2014 - ieeexplore.ieee.org
High Level Synthesis (HLS) languages and tools are emerging as the most promising
technique to make FPGAs more accessible to software developers. Nevertheless, picking …

Fast and effective placement and routing directed high-level synthesis for FPGAs

H Zheng, ST Gurumani, K Rupnow… - Proceedings of the 2014 …, 2014 - dl.acm.org
Achievable frequency (fmax) is a widely used input constraint for designs targeting Field-
Programmable Gate Arrays (FPGA), because of its impact on design latency and throughput …