A methodology for speeding up fast Fourier transform focusing on memory architecture utilization

V Kelefouras, K Djemame, G Keramidas… - International Journal of …, 2022 - Springer

Reducing the number of data accesses in memory hierarchy is of paramount importance on
modern computer systems. One of the key optimizations addressing this problem is loop …

Cited by 5 · Related articles · All 6 versions

The fastest Fourier transform in the south

AM Blake, IH Witten, MJ Cree - IEEE transactions on signal …, 2013 - ieeexplore.ieee.org

This paper describes FFTS, a discrete Fourier transform (DFT) library that achieves state-of-
the-art performance using a new cache-oblivious algorithm implemented with run-time …

Cited by 24 · Related articles · All 7 versions

An ultra-long FFT architecture implemented in a reconfigurable application specified processor

F Han, L Li, K Wang, F Feng, H Pan, B Zhang… - IEICE Electronics …, 2016 - jstage.jst.go.jp

This paper presents an efficient architecture for performing 128 points to 1M points Fast
Fourier Transformation (FFT) based on mixed radix-2/4/8 butterfly unit. The proposed FFT …

Cited by 12 · Related articles · All 7 versions

Instruction scheduling heuristic for an efficient FFT in VLIW processors with balanced resource usage

M Bahtat, S Belkouch, P Elleaume, P Le Gall - EURASIP Journal on …, 2016 - Springer

The fast Fourier transform (FFT) is perhaps today's most ubiquitous algorithm used with
digital data; hence, it is still being studied extensively. Besides the benefit of reducing the …

Cited by 11 · Related articles · All 10 versions

Computing the fast Fourier transform on SIMD microprocessors

AM Blake - 2012 - researchcommons.waikato.ac.nz

This thesis describes how to compute the fast Fourier transform (FFT) of a power-of-two
length signal on single-instruction, multiple-data (SIMD) microprocessors faster than or very …

Cited by 14 · Related articles

A methodology for speeding up edge and line detection algorithms focusing on memory architecture utilization

V Kelefouras, A Kritikakou, C Goutis - The Journal of Supercomputing, 2014 - Springer

In this paper, a new methodology for speeding up edge and line detection algorithms is
presented, achieving improved performance over the state of the art software library …

Cited by 10 · Related articles · All 8 versions

An analytical model for loop tiling transformation

V Kelefouras, K Djemame, G Keramidas… - … Conference on Embedded …, 2021 - Springer

Loop tiling is a well-known loop transformation that enhances data locality in memory
hierarchy. In this paper, we initially reveal two important inefficiencies of current analytical …

Cited by 2 · Related articles · All 5 versions

A methodology for speeding up MVM for regular, Toeplitz and bisymmetric Toeplitz matrices

VI Kelefouras, AS Kritikakou, K Siourounis… - Journal of Signal …, 2014 - Springer

The Matrix Vector Multiplication algorithm is an important kernel in most varied
domains and application areas and the performance of its implementations highly depends …

Cited by 5 · Related articles · All 8 versions

A methodology for speeding up loop kernels by exploiting the software information and the memory architecture

V Kelefouras, A Kritikakou, C Goutis - Computer Languages, Systems & …, 2015 - Elsevier

It is well-known that today's compilers and state of the art libraries have three major
drawbacks. First, the compiler sub-problems are optimized separately; this is not efficient …

Cited by 3 · Related articles · All 6 versions

Adaptation du calcul de la Transformée de Fourier Rapide sur une architecture mixte CPU/GPU intégrée

MA Bergach - 2015 - inria.hal.science

Intel Core multi-core architectures (IvyBridge, Haswell, ...) contain both general-purpose
CPU cores (4) and dedicated GPU cores embedded on the same …

Cited by 2 · Related articles · All 9 versions
