High-level synthesis for FPGAs: From prototyping to deployment

J Cong, B Liu, S Neuendorffer… - … on Computer-Aided …, 2011 - ieeexplore.ieee.org
Escalating system-on-chip design complexity is pushing the design community to raise the
level of abstraction beyond register transfer level. Despite the unsuccessful adoptions of …

CHARM: A composable heterogeneous accelerator-rich microprocessor

J Cong, MA Ghodrat, M Gill, B Grigorian… - Proceedings of the 2012 …, 2012 - dl.acm.org
This work discusses CHARM, a Composable Heterogeneous Accelerator-Rich
Microprocessor design that provides scalability, flexibility, and design reuse in the space of …

REAFUM: representative approximate frequent subgraph mining

R Li, W Wang - Proceedings of the 2015 SIAM International …, 2015 - SIAM
Noisy graph data and pattern variations are two thorny problems faced by mining frequent
subgraphs. Traditional exact-matching based methods, however, only generate patterns that …

Automatic enhanced CDFG generation based on runtime instrumentation

Z Yuan, Y Ma, J Bian, K Zhao - Proceedings of the 2013 IEEE …, 2013 - ieeexplore.ieee.org
Control and Data Flow Graph (CDFG) is a universal description of program behavior, which
is widely used in the co-design of software and hardware. The derivation of CDFG has been …

ReHLS: resource-aware program transformation workflow for high-level synthesis

A Lotfi, RK Gupta - 2017 IEEE International Conference on …, 2017 - ieeexplore.ieee.org
Despite considerable improvements in existing HLS tools, they still require designer
interventions to provide efficient synthesis results. This manual design space exploration …

Throughput constrained parallelism reduction in cyclo-static dataflow applications

S Carpov, L Cudennec, R Sirdey - Procedia Computer Science, 2013 - Elsevier
This paper deals with semantics-preserving parallelism reduction methods for cyclo-static
dataflow applications. Parallelism reduction is the process of equivalent actors fusioning …

Factorization approach to time-varying filter banks and wavelets

RA Gopinath - Proceedings of ICASSP'94. IEEE International …, 1994 - ieeexplore.ieee.org
A complete factorization of all optimal (in terms of quick transition) time-varying FIR unitary
filter bank tree topologies is obtained. This has applications in adaptive subband coding …

Compilation and Optimizations for Efficient Machine Learning on Embedded Systems

X Zhang, Y Chen, C Hao, S Huang, Y Li… - … Machine Learning for …, 2023 - Springer
Deep Neural Networks (DNNs) have achieved great success in a variety of
machine learning (ML) applications, delivering high-quality inferencing solutions in …

High-level synthesis of resource-shared microarchitectures from irregular complex C-code

B Liebig, A Koch - 2016 International Conference on Field …, 2016 - ieeexplore.ieee.org
Many high-level synthesis (HLS) tools aim at the hardware translation of input programs with
relatively short loop bodies, often with a very regular control flow. However, codes from …

Subgraph isomorphism based intrinsic function reduction in decompilation

Y Liu, Y Zhao, L Zhang, K Liu - Journal of Software Engineering and …, 2016 - scirp.org
Program comprehension is one of the most important applications in decompilation. The
more abstract the decompilation result the better it is understood. Intrinsic function is …