ScaleHLS: A new scalable high-level synthesis framework on multi-level intermediate representation

H Ye, C Hao, J Cheng, H Jeong… - … symposium on high …, 2022 - ieeexplore.ieee.org
High-level synthesis (HLS) has been widely adopted as it significantly improves the
hardware design productivity and enables efficient design space exploration (DSE). Existing …

Deep neural network model and FPGA accelerator co-design: Opportunities and challenges

C Hao, D Chen - 2018 14th IEEE International Conference on …, 2018 - ieeexplore.ieee.org
With an explosive growth of various neural network algorithms, their high performance
implementations on hardware platforms, such as GPUs and FPGAs, are becoming critical as …

Enabling design methodologies and future trends for edge AI: Specialization and codesign

C Hao, J Dotzel, J Xiong, L Benini, Z Zhang… - IEEE Design & …, 2021 - ieeexplore.ieee.org
This work is an introduction and a survey for the Special Issue on Machine Intelligence at the
Edge. The authors argue that workloads that were formerly performed in the cloud are …

Machine learning on FPGAs to face the IoT revolution

X Zhang, A Ramachandran, C Zhuge… - 2017 IEEE/ACM …, 2017 - ieeexplore.ieee.org
FPGAs have been rapidly adopted for acceleration of Deep Neural Networks (DNNs) with
improved latency and energy efficiency compared to CPU and GPU-based implementations …

gem5-SALAM: A system architecture for LLVM-based accelerator modeling

S Rogers, J Slycord, M Baharani… - 2020 53rd Annual IEEE …, 2020 - ieeexplore.ieee.org
With the prevalence of hardware accelerators as an integral part of the modern systems on
chip (SoCs), the ability to quickly and accurately model accelerators within the system it …

HIDA: A Hierarchical Dataflow Compiler for High-Level Synthesis

H Ye, H Jun, D Chen - Proceedings of the 29th ACM International …, 2024 - dl.acm.org
Dataflow architectures are growing in popularity due to their potential to mitigate the
challenges posed by the memory wall inherent to the von Neumann architecture. At the …

A parallel genetic algorithm with dispersion correction for HW/SW partitioning on multi-core CPU and many-core GPU

N Hou, F He, Y Zhou, Y Chen, X Yan - IEEE Access, 2017 - ieeexplore.ieee.org
In hardware/software (HW/SW) co-design, hardware/software partitioning is an essential
step in that it determines which components to be implemented in hardware and which ones …

A survey on partitioning models, solution algorithms and algorithm parallelization for hardware/software co-design

N Hou, X Yan, F He - Design Automation for Embedded Systems, 2019 - Springer
In electronic design automation, hardware/software co-design significantly reduces the time-
to-market and improves the performance of embedded systems. With the increasing scale of …

High-level synthesis for domain specific computing

H Ye, H Jun, J Yang, D Chen - … of the 2023 International Symposium on …, 2023 - dl.acm.org
This paper proposes a High-Level Synthesis (HLS) framework for domain-specific
computing. The framework contains three key components: 1) ScaleHLS, a multi-level HLS …

Early DSE and automatic generation of coarse-grained merged accelerators

I Brumar, G Zacharopoulos, Y Yao, S Rama… - ACM Transactions on …, 2023 - dl.acm.org
Post-Moore's law area-constrained systems rely on accelerators to deliver performance
enhancements. Coarse-grained accelerators can offer substantial domain acceleration, but …