A short survey of commercial cluster batch schedulers

MB Qureshi, MM Dehnavi, N Min-Allah… - Journal of Grid …, 2014 - Springer

Grid is a distributed high performance computing paradigm that offers various types of
resources (like computing, storage, communication) to resource-intensive user tasks. These …

Cited by 129 - Related articles - All 17 versions

[PDF] huji.ac.il

Backfilling using system-generated predictions rather than user runtime estimates

D Tsafrir, Y Etsion, DG Feitelson - IEEE Transactions on …, 2007 - ieeexplore.ieee.org

The most commonly used scheduling algorithm for parallel supercomputers is FCFS with
backfilling, as originally introduced in the EASY scheduler. Backfilling means that short jobs …

Cited by 536 - Related articles - All 15 versions

[PDF] acm.org

Optimal scheduling in the multiserver-job model under heavy traffic

I Grosof, Z Scully, M Harchol-Balter… - Proceedings of the ACM …, 2022 - dl.acm.org

Multiserver-job systems, where jobs require concurrent service at many servers, occur
widely in practice. Essentially all of the theoretical work on multiserver-job systems focuses …

Cited by 29 - Related articles - All 8 versions

[PDF] arxiv.org

The RESET and MARC techniques, with application to multiserver-job analysis

I Grosof, Y Hong, M Harchol-Balter… - Performance …, 2023 - Elsevier

Multiserver-job (MSJ) systems, where jobs need to run concurrently across many
servers, are increasingly common in practice. The default service ordering in many settings …

Cited by 15 - Related articles - All 6 versions

[PDF] psu.edu

Modeling user runtime estimates

D Tsafrir, Y Etsion, DG Feitelson - … 2005, Cambridge, MA, USA, June 19 …, 2005 - Springer

User estimates of job runtimes have emerged as an important component of the workload on
parallel machines, and can have a significant impact on how a scheduler treats different …

Cited by 169 - Related articles - All 16 versions

Priority-based consolidation of parallel workloads in the cloud

X Liu, C Wang, BB Zhou, J Chen… - … on Parallel and …, 2012 - ieeexplore.ieee.org

The cloud computing paradigm is attracting an increased number of complex applications to
run in remote data centers. Many complex applications require parallel processing …

Cited by 111 - Related articles - All 5 versions

[PDF] googleapis.com

Distributed job manager recovery

JR Challenger, LR Degenaro, JR Giles… - US Patent …, 2010 - Google Patents

A method is provided for the recovery of an instance of a job manager
running on one of a plurality of nodes used to execute the processing elements associated …

Cited by 133 - Related articles - All 4 versions

[PDF] derby.ac.uk

Exploring decentralized dynamic scheduling for grids and clouds using the community-aware scheduling algorithm

Y Huang, N Bessis, P Norrington, P Kuonen… - Future Generation …, 2013 - Elsevier

Job scheduling strategies have been studied for decades in a variety of scenarios. Due to
the new characteristics of the emerging computational systems, such as the grid and cloud …

Cited by 114 - Related articles - All 10 versions

[PDF] sciencedirect.com

Towards understanding HPC users and systems: a NERSC case study

GP Rodrigo, PO Östberg, E Elmroth, K Antypas… - Journal of Parallel and …, 2018 - Elsevier

High performance computing (HPC) scheduling landscape currently faces new challenges
due to the changes in the workload. Previously, HPC centers were dominated by tightly …

Cited by 74 - Related articles - All 8 versions

[HTML] sciencedirect.com

[HTML] A machine learning approach for an HPC use case: The jobs queuing time prediction

C Vercellino, A Scionti, G Varavallo, P Viviani… - Future Generation …, 2023 - Elsevier

High-Performance Computing (HPC) domain provided the necessary tools to
support the scientific and industrial advancements we all have seen during the last decades …

Cited by 12 - Related articles - All 4 versions
