Mitigating processor variation through dynamic load balancing

B Acun, LV Kale - 2016 IEEE International Parallel and Distributed Processing …, 2016 - ieeexplore.ieee.org
There can be performance variation among same-model processors in large-scale clusters and supercomputers, caused by power and temperature variations among the processors. These variations manifest themselves as frequency differences among the processors under dynamic overclocking, such as Turbo Boost. Different-model processors also create an inherent variation when used in the same cluster. For some tightly coupled HPC applications, even one slow processor in the critical path can slow down the whole application; therefore, this variation is an important problem. To mitigate the performance variation among processors, we propose a speed-aware dynamic load balancing strategy that works on both homogeneous and non-homogeneous hardware. Our main idea is to estimate a task's completion time, based on processor speed, when moving the task from one processor to another. We show up to 30% performance improvement using our speed-aware load balancer compared to the no-load-balancing case. We also show that our speed-aware balancer performs 5% better than its non-speed-aware counterpart.
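The idea of scaling a task's measured time by the ratio of processor speeds can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names (`estimate_time`, `speed_aware_greedy`), the task tuple layout, and the simple greedy placement are all assumptions made for illustration, and it assumes task time is inversely proportional to processor frequency.

```python
def estimate_time(measured_time, src_speed, dst_speed):
    """Estimate a task's time on the destination processor.

    Assumption: run time scales inversely with processor speed
    (for example, clock frequency in GHz).
    """
    return measured_time * src_speed / dst_speed


def speed_aware_greedy(tasks, speeds):
    """Hypothetical greedy balancer.

    tasks  -- list of (task_id, measured_time, current_proc)
    speeds -- list of per-processor speeds, indexed by processor id
    Returns (assignment, loads): task_id -> processor, and the
    estimated busy time of each processor.
    """
    # Convert each measured time to speed-independent work units.
    work = [(t * speeds[p], tid) for tid, t, p in tasks]
    work.sort(reverse=True)  # place the largest tasks first

    loads = [0.0] * len(speeds)
    assignment = {}
    for w, tid in work:
        # Choose the processor with the earliest estimated finish time.
        best = min(range(len(speeds)), key=lambda p: loads[p] + w / speeds[p])
        assignment[tid] = best
        loads[best] += w / speeds[best]
    return assignment, loads


if __name__ == "__main__":
    # Four equal tasks, each measured at 1.0 s on processor 0 (speed 2.0);
    # processor 1 runs at half that speed.
    tasks = [(i, 1.0, 0) for i in range(4)]
    assignment, loads = speed_aware_greedy(tasks, [2.0, 1.0])
    print(assignment, loads)
```

In this toy case a speed-unaware balancer that splits the tasks evenly would leave the slower processor as the bottleneck, whereas the sketch places more tasks on the faster processor.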