High-level synthesis design space exploration: Past, present, and future

BC Schafer, Z Wang - … on Computer-Aided Design of Integrated …, 2019 - ieeexplore.ieee.org
This article presents a survey of the different modern high-level synthesis (HLS) design
space exploration (DSE) techniques that have been proposed so far to automatically …

Modern development methods and tools for embedded reconfigurable systems: A survey

L Jóźwiak, N Nedjah, M Figueroa - Integration, 2010 - Elsevier
Heterogeneous reconfigurable systems provide drastically higher performance and lower
power consumption than traditional CPU-centric systems. Moreover, they do it at much lower …

Predicting whole-program locality through reuse distance analysis

C Ding, Y Zhong - Proceedings of the ACM SIGPLAN 2003 conference …, 2003 - dl.acm.org
Profiling can accurately analyze program behavior for select data inputs. We show that
profiling can also predict program locality for inputs other than profiled ones. Here locality is …

[BOOK][B] Automatic performance tuning of sparse matrix kernels

RW Vuduc - 2003 - search.proquest.com
This dissertation presents an automated system to generate highly efficient, platform-
adapted implementations of sparse matrix kernels. We show that conventional …

Polyhedral-based data reuse optimization for configurable computing

LN Pouchet, P Zhang, P Sadayappan… - Proceedings of the ACM …, 2013 - dl.acm.org
Many applications, such as medical imaging, generate intensive data traffic between the
FPGA and off-chip memory. Significant improvements in the execution time can be achieved …

Lin-analyzer: A high-level performance analysis tool for FPGA-based accelerators

G Zhong, A Prakash, Y Liang, T Mitra… - Proceedings of the 53rd …, 2016 - dl.acm.org
The increasing complexity of FPGA-based accelerators, coupled with time-to-market
pressure, makes high-level synthesis (HLS) an attractive solution to improve designer …

Program locality analysis using reuse distance

Y Zhong, X Shen, C Ding - ACM Transactions on Programming …, 2009 - dl.acm.org
On modern computer systems, the memory performance of an application depends on its
locality. For a single execution, locality-correlated measures like average miss rate or …

Optimizing compiler for the cell processor

AE Eichenberger, K O'Brien, K O'Brien… - 14th International …, 2005 - ieeexplore.ieee.org
Developed for multimedia and game applications, as well as other numerically intensive
workloads, the CELL processor provides support both for highly parallel codes, which have …

Measurements of extremely low radioactivity levels in BOREXINO

C Arpesella, HO Back, M Balata, T Beau, G Bellini… - Astroparticle …, 2002 - Elsevier
The techniques researched, developed and applied towards the measurement of
radioisotope concentrations at ultra-low levels in the real-time solar neutrino experiment …

Optimizing for parallelism and data locality

K Kennedy, KS McKinley - … of the 6th international conference on …, 1992 - dl.acm.org
Previous research has used program transformation to introduce parallelism and to exploit
data locality. Unfortunately, these two objectives have usually been considered …