Tarragon: a programming model for latency-hiding scientific computations

JM Wozniak, TG Armstrong, M Wilde… - 2013 13th IEEE/ACM …, 2013 - ieeexplore.ieee.org

Many scientific applications are conceptually built up from independent component tasks as
a parameter study, optimization, or other search. Large batches of these tasks may be …

Cited by 210 · Related articles · All 16 versions

[PDF] researchgate.net

Recomputing coverage information to assist regression testing

PK Chittimalli, MJ Harrold - IEEE Transactions on Software …, 2009 - ieeexplore.ieee.org

This paper presents a technique that leverages an existing regression test selection
algorithm to compute accurate, updated coverage data on a version of the software, P_{i+1} …

Cited by 124 · Related articles · All 10 versions

[PDF] academia.edu

A study on communication issues for systems-on-chip

CA Zeferino, ME Kreutz, L Carro… - … . 15th Symposium on …, 2002 - ieeexplore.ieee.org

Present days cores composing a system-on-chip might be interconnected by means of both
dedicated channels or shared buses. Nevertheless, future systems will have strong …

Cited by 139 · Related articles · All 6 versions

[PDF] academia.edu

Swift/T: Scalable data flow programming for many-task applications

JM Wozniak, TG Armstrong, M Wilde, DS Katz… - Proceedings of the 18th …, 2013 - dl.acm.org

Justin M. Wozniak, Argonne National …

Cited by 70 · Related articles · All 12 versions

[PDF] ucsd.edu

Bamboo--Translating MPI applications to a latency-tolerant, data-driven form

T Nguyen, P Cicotti, E Bylaska… - SC'12: Proceedings …, 2012 - ieeexplore.ieee.org

We present Bamboo, a custom source-to-source translator that transforms MPI C source into
a data-driven form that automatically overlaps communication with available computation …

Cited by 41 · Related articles · All 11 versions

[BOOK][B] Programming models for parallel computing

P Balaji - 2015 - books.google.com

An overview of the most prominent contemporary parallel processing programming models,
written in a unique tutorial style. With the coming of the parallel computing era, computer …

Cited by 23 · Related articles · All 8 versions

Data movement in data-intensive high performance computing

P Cicotti, S Oral, G Kestor, R Gioiosa, S Strande… - Conquering Big Data …, 2016 - Springer

The cost of executing a floating point operation has been decreasing for decades at a much
higher rate than that of moving data. Bandwidth and latency, two key metrics that determine …

Cited by 10 · Related articles · All 4 versions

[PDF] google.com

Perilla: Metadata-based optimizations of an asynchronous runtime for adaptive mesh refinement

T Nguyen, D Unat, W Zhang, A Almgren… - SC'16: Proceedings …, 2016 - ieeexplore.ieee.org

Hardware architecture is increasingly complex, urging the development of asynchronous
runtime systems with advance resource and locality management supports. However, these …

Cited by 8 · Related articles · All 4 versions

[PDF] sciencedirect.com

Automatic translation of MPI source into a latency-tolerant, data-driven form

T Nguyen, P Cicotti, E Bylaska, D Quinlan… - Journal of Parallel and …, 2017 - Elsevier

Hiding communication behind useful computation is an important performance programming
technique but remains an inscrutable programming exercise even for the expert. We present …

Cited by 6 · Related articles · All 7 versions

[PDF] netlib.org

POSTER: Utilizing dataflow-based execution for coupled cluster methods

H McCraw, A Danalis, T Herault… - 2014 IEEE …, 2014 - ieeexplore.ieee.org

Computational chemistry comprises one of the driving forces of High Performance
Computing. In particular, many-body methods, such as Coupled Cluster methods (CC)[1] of …

Cited by 7 · Related articles · All 4 versions
