Dark silicon aware runtime mapping for many-core systems: A patterning approach

A Kanduri, MH Haghbayan… - 2015 33rd IEEE …, 2015 - ieeexplore.ieee.org
Limitation on power budget in many-core systems leaves a fraction of on-chip resources
inactive, referred to as dark silicon. In such systems, an efficient run-time application …

adBoost: Thermal aware performance boosting through dark silicon patterning

AM Rahmani, M Shafique, A Jantsch… - IEEE Transactions on …, 2018 - ieeexplore.ieee.org
Increasing power densities of many-core systems leaves a fraction of on-chip resources
inactive, referred to as dark silicon. Efficient management of critical interlinked parameters …

Adjustable contiguity of run-time task allocation in networked many-core systems

M Fattah, P Liljeberg, J Plosila… - 2014 19th Asia and …, 2014 - ieeexplore.ieee.org
In this paper, we propose a run-time mapping algorithm, CASqA, for networked many-core
systems. In this algorithm, the level of contiguousness of the allocated processors (α) can be …

PaCMap: Topology mapping of unstructured communication patterns onto non-contiguous allocations

O Tuncer, VJ Leung, AK Coskun - Proceedings of the 29th ACM on …, 2015 - dl.acm.org
In high performance computing (HPC), applications usually have many parallel tasks
running on multiple machine nodes. As these tasks intensively communicate with each …

Simulation and optimization of HPC job allocation for jointly reducing communication and cooling costs

J Meng, S McCauley, F Kaplan, VJ Leung… - … Informatics and Systems, 2015 - Elsevier
Performance and energy are critical aspects in high performance computing (HPC) data
centers. Highly parallel HPC applications that require multiple nodes usually run for long …

Task scheduling for many-cores with S-NUCA caches

A Pathania, J Henkel - 2018 Design, Automation & Test in …, 2018 - ieeexplore.ieee.org
A many-core processor may comprise a large number of processing cores on a single chip.
The many-core's last-level shared cache can potentially be physically distributed alongside …

Efficient top-k spatial locality search for co-located spatial web objects

Q Qu, S Liu, B Yang, CS Jensen - 2014 IEEE 15th International …, 2014 - ieeexplore.ieee.org
In step with the web being used widely by mobile users, user location is becoming an
essential signal in services, including local intent search. Given a large set of spatial web …

Parallel job scheduling policies to improve fairness: A case study

VJ Leung, G Sabin… - 2010 39th International …, 2010 - ieeexplore.ieee.org
Balancing fairness, user performance, and system performance is a critical concern when
developing and installing parallel schedulers. Sandia uses a customized scheduler to …

A multi-faceted approach to job placement for improved performance on extreme-scale systems

C Zimmer, S Gupta, S Atchley… - SC'16: Proceedings …, 2016 - ieeexplore.ieee.org
Job placement plays a pivotal role in application performance on supercomputers. We
present a multi-faceted exploration to influence placement in extreme-scale systems, to …

Communication and cooling aware job allocation in data centers for communication-intensive workloads

J Meng, E Llamosí, F Kaplan, C Zhang, J Sheng… - Journal of Parallel and …, 2016 - Elsevier
Energy consumption is an increasingly important concern in data centers. Today, nearly half
of the energy in data centers is consumed by the cooling infrastructure. Existing policies on …