Memory partitioning for multidimensional arrays in high-level synthesis

Y Wang, P Li, P Zhang, C Zhang, J Cong - Proceedings of the 50th …, 2013 - dl.acm.org
Memory partitioning is widely adopted to efficiently increase the memory bandwidth by using
multiple memory banks and reducing data access conflict. Previous methods for memory …

OMNI: A framework for integrating hardware and software optimizations for sparse CNNs

Y Liang, L Lu, J Xie - … on Computer-Aided Design of Integrated …, 2020 - ieeexplore.ieee.org
Convolution neural networks (CNNs) as one of today's main flavor of deep learning
techniques dominate in various image recognition tasks. As the model size of modern CNNs …

An optimal microarchitecture for stencil computation acceleration based on non-uniform partitioning of data reuse buffers

J Cong, P Li, B Xiao, P Zhang - Proceedings of the 51st annual design …, 2014 - dl.acm.org
High-level synthesis (HLS) tools have made significant progress in compiling high-level
descriptions of computation into highly pipelined register-transfer level (RTL) specifications …

A new approach to automatic memory banking using trace-based address mining

Y Zhou, KM Al-Hawaj, Z Zhang - Proceedings of the 2017 ACM/SIGDA …, 2017 - dl.acm.org
Recent years have seen an increased deployment of FPGAs as programmable accelerators
for improving the performance and energy efficiency of compute-intensive applications. A …

An exploration framework for efficient high-level synthesis of support vector machines: Case study on ECG arrhythmia detection for Xilinx Zynq SoC

V Tsoutsouras, K Koliogeorgi, S Xydis… - Journal of Signal …, 2017 - Springer
In recent years, Support Vector Machine (SVM) classifiers have played a crucial
role in providing data fusion and high accuracy classification solutions for various, complex …

Efficient memory partitioning for parallel data access in multidimensional arrays

C Meng, S Yin, P Ouyang, L Liu, S Wei - Proceedings of the 52nd Annual …, 2015 - dl.acm.org
Memory bandwidth bottlenecks severely restrict parallel access of data from memory arrays.
To increase bandwidth, memory partitioning algorithms have been proposed to access …

An integrated and automated memory optimization flow for FPGA behavioral synthesis

Y Wang, P Zhang, X Cheng… - 17th Asia and South …, 2012 - ieeexplore.ieee.org
Behavioral synthesis tools have made significant progress in compiling high-level programs
into register-transfer level (RTL) specifications. But manually rewriting code is still necessary …

High-level synthesis for semi-global matching: Is the juice worth the squeeze?

A Qamar, FB Muslim, F Gregoretti, L Lavagno… - IEEE …, 2016 - ieeexplore.ieee.org
High-level synthesis (HLS)-based design methodologies are extremely viable for industries
that are sensitive to production costs. In order to have competitive advantage, the ability to …

Efficient memory partitioning for parallel data access via data reuse

J Su, F Yang, X Zeng, D Zhou - Proceedings of the 2016 ACM/SIGDA …, 2016 - dl.acm.org
In this paper, we propose an efficient memory partitioning algorithm for parallel data access
via data reuse. We found that for most of the applications in image and video processing, a …

Automated generation of banked memory architectures in the high-level synthesis of multi-threaded software

YT Chen, JH Anderson - 2017 27th International Conference on …, 2017 - ieeexplore.ieee.org
Some modern high-level synthesis (HLS) tools [1] permit the synthesis of multi-threaded
software into parallel hardware, where concurrent software threads are realized as …