- Academic resource search

Effective automatic parallelization of stencil computations

S Krishnamoorthy, M Baskaran, U Bondhugula… - ACM sigplan …, 2007 - dl.acm.org

Performance optimization of stencil computations has been widely studied in the literature,
since they occur in many computationally intensive scientific and engineering applications …

Cited by 324 · Related articles · All 14 versions

[PDF] ucsd.edu

Optimizing compiler for the cell processor

AE Eichenberger, K O'Brien, K O'Brien… - 14th International …, 2005 - ieeexplore.ieee.org

Developed for multimedia and game applications, as well as other numerically intensive
workloads, the CELL processor provides support both for highly parallel codes, which have …

Cited by 260 · Related articles · All 23 versions

[PDF] scarpaz.com

The Jrpm system for dynamically parallelizing Java programs

MK Chen, K Olukotun - Proceedings of the 30th annual international …, 2003 - dl.acm.org

We describe the Java runtime parallelizing machine (Jrpm), a complete system for
parallelizing sequential programs automatically. Jrpm is based on a chip multiprocessor …

Cited by 198 · Related articles · All 15 versions

[PDF] researchgate.net

Component based simulation modeling with Simkit

A Buss - Proceedings of the Winter Simulation Conference, 2002 - ieeexplore.ieee.org

This paper demonstrates how to use Simkit to create discrete event simulation models using
a component framework. The component framework is based on a listener design pattern …

Cited by 135 · Related articles · All 7 versions

Auditory distance perception by translating observers

JM Speigle, JM Loomis - … of 1993 IEEE Research Properties in …, 1993 - ieeexplore.ieee.org

The authors consider auditory distance perception of a moving observer and its relevance
for the perception of stationary and moving sources. They begin with a review of some of the …

Cited by 99 · Related articles · All 4 versions

[PDF] tum.de

Toast: A heterogeneous memory management system

M Bailleu, D Stavrakakis, R Rocha… - Proceedings of the …, 2024 - dl.acm.org

Modern applications employ several heterogeneous memory types for improved
performance, security, and reliability. To manage them, programmers must currently digress …

Cited by 1 · Related articles · All 3 versions

[PDF] academia.edu

[PDF] Implementation of NAS parallel benchmarks in high performance fortran

M Frumkin, H Jin, J Yan - NAS Technical Report NAS-98-009, 1998 - academia.edu

We present an HPF implementation of BT, SP, LU, FT, CG and MG of the NPB2.3-serial
benchmark set. The implementation is based on HPF performance model of the benchmark …

Cited by 85 · Related articles · All 6 versions

[PDF] scarpaz.com

TEST: a tracer for extracting speculative threads

M Chen, K Olukotun - International Symposium on Code …, 2003 - ieeexplore.ieee.org

Thread-level speculation (TLS) allows sequential programs to be arbitrarily decomposed
into threads that can be safely executed in parallel. A key challenge for TLS processors is …

Cited by 83 · Related articles · All 13 versions

[PDF] rice.edu

Increasing temporal locality with skewing and recursive blocking

G Jin, J Mellor-Crummey, R Fowler - Proceedings of the 2001 ACM/IEEE …, 2001 - dl.acm.org

We present a strategy, called recursive prismatic time skewing, that increases temporal reuse
at all memory hierarchy levels, thus improving the performance of scientific codes that use …

Cited by 59 · Related articles · All 18 versions

[PDF] acm.org

Automatic data and computation decomposition on distributed memory parallel computers

P Lee, ZM Kedem - ACM Transactions on Programming Languages and …, 2002 - dl.acm.org

To exploit parallelism on shared memory parallel computers (SMPCs), it is natural to focus
on decomposing the computation (mainly by distributing the iterations of the nested Do …

Cited by 52 · Related articles · All 14 versions
