A Distributed Scheme for Fault-Tolerance in Large Clusters of Workstations.

Contention awareness and fault-tolerant scheduling for precedence constrained tasks in heterogeneous systems

A Benoit, M Hakem, Y Robert - Parallel Computing, 2009 - Elsevier

Heterogeneous distributed systems are widely deployed for executing computationally
intensive parallel applications with diverse computing needs. Such environments require …

Cited by 77 | Related articles | All 7 versions

[PDF] academia.edu

An intelligent management of fault tolerance in cluster using RADICMPI

A Duarte, D Rexachs, E Luque - European Parallel Virtual Machine …, 2006 - Springer

Independence of special elements, transparency and scalability are very significant features
required from the fault tolerance schemes for modern clusters of computers. In order to …

Cited by 27 | Related articles | All 17 versions

[PDF] arxiv.org

Optimizing latency and reliability of pipeline workflow applications

A Benoit, V Rehn-Sonigo… - 2008 IEEE International …, 2008 - ieeexplore.ieee.org

Mapping applications onto heterogeneous platforms is a difficult challenge, even for simple
application patterns such as pipeline graphs. The problem is even more complex when …

Cited by 17 | Related articles | All 19 versions

A fault tolerant approach in cluster computing system

T Shwe, W Aye - 2008 5th International Conference on …, 2008 - ieeexplore.ieee.org

A long-term trend in high performance computing is the increasing number of nodes in
parallel computing platforms, which entails a higher failure probability. Hence, fault …

Cited by 12 | Related articles | All 2 versions

[PDF] uab.cat

[BOOK][B] RADIC: a powerful fault-tolerant architecture

AA Duarte - 2007 - ddd.uab.cat

Fault tolerance has become an important requirement for computer engineers
and software developers, because the occurrence of faults …

Cited by 13 | Related articles | All 3 versions

[PDF] archives-ouvertes.fr

Who needs a scheduler?

A Benoit, L Marchal, Y Robert - 2008 - hal-lara.archives-ouvertes.fr

This position paper advocates the need for scheduling. Even if resources at our disposal
would become abundant and cheap, not to say unlimited and free (a perspective that is not …

Cited by 5 | Related articles | All 9 versions

Algorithms and scheduling techniques for clusters and grids

A Benoit, L Marchal, Y Robert… - High Speed and Large …, 2009 - ebooks.iospress.nl

The main objective of this chapter is to show the need for algorithmic and scheduling
techniques. Even if resources at our disposal would become abundant and cheap, not to say …

Cited by 3 | Related articles | All 2 versions

Functional tests of the RADIC fault tolerance architecture

A Duarte, D Rexachs, E Luque - … International Conference on …, 2007 - ieeexplore.ieee.org

Clusters with thousands of nodes are a reality and the current trend indicates that they are
becoming larger. Such large clusters are subject to a relatively high fault frequency so a fault …

Cited by 4 | Related articles | All 4 versions

Scheduling for numerical linear algebra library at scale

J Kurzak, H Ltaief, JJ Dongarra… - High Speed and Large …, 2009 - ebooks.iospress.nl

State-of-the-art dense linear algebra software, such as the LAPACK and ScaLAPACK
libraries, suffer performance losses on multicore processors due to their inability to fully …

Cited by 3 | Related articles | All 3 versions

[PDF] unlp.edu.ar

Recuperando prestaciones en clusters tras ocurrencia de fallos utilizando RADIC

GA Santos, A Duarte… - … de Ciencias de la …, 2006 - sedici.unlp.edu.ar

After recovering from a fault, applications lose performance, largely because
the planned number of nodes has decreased and because of the loss caused by the …

Cited by 3 | Related articles | All 6 versions
