OmpSs-2@Cluster: Distributed memory execution of nested OpenMP-style tasks

JA Mena, O Shaaban, V Lopez, M Garcia… - 51st International …, 2022 - paul-carpenter.org

Load imbalance is a long-standing source of inefficiency in high performance
computing. The situation has only got worse as applications and systems increase in …

Cited by 6 Related articles

[PDF] acm.org

Itoyori: Reconciling Global Address Space and Global Fork-Join Task Parallelism

S Shiina, K Taura - Proceedings of the International Conference for High …, 2023 - dl.acm.org

This paper introduces Itoyori, a task-parallel runtime system designed to tackle the
challenge of scaling task parallelism (more specifically, nested fork-join parallelism) beyond …

Cited by 1 Related articles All 3 versions

Scalable tasking runtime with parallelized builders for explicit message passing architectures

X Gao, L Chen, H Wang, H Cui, X Feng - Parallel Computing, 2025 - Elsevier

The sequential task flow (STF) model introduces implicit data dependences to exploit
task-based parallelism, simplifying programming but also introducing non-negligible runtime …

[PDF] wiley.com

Toward a Dynamic Allocation Strategy for Deadline‐Oriented Resource and Job Management in HPC Systems

B Linnert, CAF De Rose… - … and Computation: Practice …, 2025 - Wiley Online Library

As high‐performance computing (HPC) becomes a tool used in many different workflows,
quality of service (QoS) becomes increasingly important. In many cases, this includes the …

Automatic aggregation of subtask accesses for nested OpenMP-style tasks

O Shaaban, J Aguilar, V Beltran… - 2022 IEEE 34th …, 2022 - ieeexplore.ieee.org

Task-based programming is a high performance and productive model to express
parallelism. Tasks encapsulate work to be executed across multiple cores or offloaded to …

Cited by 3 Related articles All 4 versions

Towards achieving transparent malleability thanks to MPI process virtualization

H Taboada, R Pereira, J Jaeger, JB Besnard - International Conference on …, 2023 - Springer

The field of High-Performance Computing is rapidly evolving, driven by the race for
computing power and the emergence of new architectures. Despite these changes, the …

Cited by 1 Related articles All 3 versions

[PDF] arxiv.org

On Memory Codelets: Prefetching, Recoding, Moving and Streaming Data

D Fox, JM Diaz, X Li - arXiv preprint arXiv:2302.00115, 2023 - arxiv.org

For decades, memory capabilities have scaled up much slower than compute capabilities,
leaving memory utilization as a major bottleneck. Prefetching and cache hierarchies mitigate …

Cited by 1 Related articles All 2 versions

[PDF] hal.science

On the use of hierarchical task for heterogeneous architectures

G Lucas - 2023 - theses.hal.science

In the last decades, the computing power of high-performance platforms has grown
exponentially at the expense of increased complexity. Programming such platforms to take …

Cited by 2 Related articles All 4 versions

[PDF] upc.edu

Transparent load balancing of MPI programs using OmpSs-2@Cluster and DLB

J Aguilar Mena, O Shaaban, V Lopez… - Proceedings of the 51st …, 2022 - dl.acm.org

Load imbalance is a long-standing source of inefficiency in high performance computing.
The situation has only got worse as applications and systems increase in complexity, e.g. …

[PDF] The DEEP-SEA project: a software stack for heterogeneous and modular supercomputers

E Suarez, N Eicker, HC Hoppe - PARS-Mitteilungen, 2024 - dl.gi.de

Today's most powerful supercomputers achieve their performance through heterogeneous
system architectures that integrate CPUs with accelerators, especially GPUs, and advanced …
