Predictive performance modeling for distributed batch processing using black box monitoring and machine learning

C Witt, M Bux, W Gusew, U Leser - Information Systems, 2019 - Elsevier
In many domains, the previous decade was characterized by increasing data volumes and
growing complexity of data analyses, creating new demands for batch processing on …

A guide to dynamic load balancing in distributed computer systems

AM Alakeel - International Journal of Computer Science and …, 2010 - researchgate.net
Load balancing is the process of redistributing the work load among nodes of the distributed
system to improve both resource utilization and job response time while also avoiding a …

Predicting application run times using historical information

W Smith, I Foster, V Taylor - … Strategies for Parallel Processing: IPPS/SPDP …, 1998 - Springer
We present a technique for deriving predictions for the run times of parallel applications from
the run times of “similar” applications that have executed in the past. The novel aspect of our …

Statistical prediction of task execution times through analytic benchmarking for scheduling in a heterogeneous environment

MA Iverson, F Ozguner, LC Potter - … Computing Workshop (HCW …, 1999 - ieeexplore.ieee.org
In this paper a method for estimating task execution times is presented, in order to facilitate
dynamic scheduling in a heterogeneous metacomputing environment. Execution time is …

A historical application profiler for use by parallel schedulers

R Gibbons - Workshop on Job Scheduling Strategies for Parallel …, 1997 - Springer
Scheduling algorithms that use application and system knowledge have been shown to be
more effective at scheduling parallel jobs on a multiprocessor than algorithms that do not …

A measurement-based model for estimation of resource exhaustion in operational software systems

K Vaidyanathan, KS Trivedi - Proceedings 10th International …, 1999 - ieeexplore.ieee.org
Software systems are known to suffer from outages due to transient errors. Recently, the
phenomenon of "software aging", in which the state of the software system degrades with …

Job characteristics of a production parallel scientific workload on the NASA Ames iPSC/860

DG Feitelson, B Nitzberg - Workshop on Job Scheduling Strategies for …, 1995 - Springer
Statistics of a parallel workload on a 128-node iPSC/860 located at NASA Ames are
presented. It is shown that while the number of sequential jobs dominates the number of …

Predicting queue times on space-sharing parallel computers

AB Downey - Proceedings 11th International Parallel …, 1997 - ieeexplore.ieee.org
We present statistical techniques for predicting the queue times experienced by jobs
submitted to a space-sharing parallel machine with first-come-first-served (FCFS) …

Implementing a performance forecasting system for metacomputing: The network weather service

R Wolski, N Spring, C Peterson - Proceedings of the 1997 ACM/IEEE …, 1997 - dl.acm.org
In this paper we describe the design and implementation of a system called the Network
Weather Service (NWS) that takes periodic measurements of deliverable resource …

Predictive application-performance modeling in a computational grid environment

NH Kapadia, JAB Fortes… - Proceedings. The Eighth …, 1999 - ieeexplore.ieee.org
This paper describes and evaluates the application of three local learning algorithms
(nearest-neighbor, weighted-average, and locally-weighted polynomial regression) for the …