DAGuE: A generic distributed DAG engine for high performance computing

M Rocklin - SciPy, 2015 - conference.scipy.org.s3.amazonaws …

Dask enables parallel and out-of-core computation. We couple blocked algorithms with
dynamic and memory aware task scheduling to achieve a parallel and out-of-core NumPy …

Cited by 900 · Related articles · All 5 versions

[PDF] arxiv.org

Wukong: A scalable and locality-enhanced framework for serverless parallel computing

B Carver, J Zhang, A Wang, A Anwar, P Wu… - Proceedings of the 11th …, 2020 - dl.acm.org

Executing complex, burst-parallel, directed acyclic graph (DAG) jobs poses a major
challenge for serverless execution frameworks, which will need to rapidly scale and …

Cited by 138 · Related articles · All 11 versions

[PDF] acm.org

Algebraic methods for interactive proof systems

C Lund, L Fortnow, H Karloff, N Nisan - Journal of the ACM (JACM), 1992 - dl.acm.org

A new algebraic technique for the construction of interactive proof systems is presented. Our
technique is used to prove that every language in the polynomial-time hierarchy has an …

Cited by 1175 · Related articles · All 23 versions

[PDF] acm.org

Serverless linear algebra

V Shankar, K Krauth, K Vodrahalli, Q Pu… - Proceedings of the 11th …, 2020 - dl.acm.org

Datacenter disaggregation provides numerous benefits to both the datacenter operator and
the application designer. However switching from the server-centric model to a …

Cited by 111 · Related articles · All 4 versions

[PDF] hal.science

XKaapi: A runtime system for data-flow task programming on heterogeneous architectures

T Gautier, JVF Lima, N Maillard… - 2013 IEEE 27th …, 2013 - ieeexplore.ieee.org

Most recent HPC platforms have heterogeneous nodes composed of multi-core CPUs and
accelerators, like GPUs. Programming such nodes is typically based on a combination of …

Cited by 276 · Related articles · All 21 versions

[PDF] arxiv.org

An efficient multicore implementation of a novel HSS-structured multifrontal solver using randomized sampling

P Ghysels, XS Li, FH Rouet, S Williams… - SIAM Journal on Scientific …, 2016 - SIAM

We present a sparse linear system solver that is based on a multifrontal variant of Gaussian
elimination and exploits low-rank approximation of the resulting dense frontal matrices. We …

Cited by 182 · Related articles · All 14 versions

[PDF] researchgate.net

Swift/T: Large-scale application composition via distributed-memory dataflow processing

JM Wozniak, TG Armstrong, M Wilde… - 2013 13th IEEE/ACM …, 2013 - ieeexplore.ieee.org

Many scientific applications are conceptually built up from independent component tasks as
a parameter study, optimization, or other search. Large batches of these tasks may be …

Cited by 210 · Related articles · All 16 versions

[PDF] utk.edu

The singular value decomposition: Anatomy of optimizing an algorithm for extreme scale

J Dongarra, M Gates, A Haidar, J Kurzak, P Luszczek… - SIAM Review, 2018 - SIAM

The computation of the singular value decomposition, or SVD, has a long history with many
improvements over the years, both in its implementations and algorithmically. Here, we …

Cited by 115 · Related articles · All 7 versions

[PDF] academia.edu

Flexible development of dense linear algebra algorithms on massively parallel architectures with DPLASMA

G Bosilca, A Bouteiller, A Danalis… - … on Parallel and …, 2011 - ieeexplore.ieee.org

We present a method for developing dense linear algebra algorithms that seamlessly scales
to thousands of cores. It can be done with our project called DPLASMA (Distributed …

Cited by 205 · Related articles · All 20 versions

[PDF] hal.science

Achieving high performance on supercomputers with a sequential task-based programming model

E Agullo, O Aumage, M Faverge… - … on Parallel and …, 2017 - ieeexplore.ieee.org

The emergence of accelerators as standard computing resources on supercomputers and
the subsequent architectural complexity increase revived the need for high-level parallel …

Cited by 126 · Related articles · All 12 versions
