Machine learning in compiler optimization

Z Wang, M O'Boyle - Proceedings of the IEEE, 2018 - ieeexplore.ieee.org
In the last decade, machine-learning-based compilation has moved from an obscure
research niche to a mainstream activity. In this paper, we describe the relationship between …

Thread reinforcer: Dynamically determining number of threads via OS level monitoring

KK Pusukuri, R Gupta… - 2011 IEEE International …, 2011 - ieeexplore.ieee.org
It is often assumed that to maximize the performance of a multithreaded application, the
number of threads created should equal the number of cores. While this may be true for …

Cache conscious task regrouping on multicore processors

X Xiang, B Bao, C Ding, K Shen - 2012 12th IEEE/ACM …, 2012 - ieeexplore.ieee.org
Because of the interference in the shared cache on multicore processors, the performance of
a program can be severely affected by its co-running programs. If job scheduling does not …

Artificial synesthesia via sonification: A wearable augmented sensory system

LN Foner - Mobile Networks and Applications, 1999 - Springer
A design for an implemented, prototype wearable artificial sensory system is presented,
which uses data sonification to compensate for normal limitations in the human visual …

A practical approach for performance analysis of shared-memory programs

BM Tudor, YM Teo - 2011 IEEE International Parallel & …, 2011 - ieeexplore.ieee.org
Parallel programming has transcended from HPC into mainstream, enabled by a growing
number of programming models, languages and methodologies, as well as the availability of …

Critical path-based thread placement for NUMA systems

CY Su, D Li, DS Nikolopoulos, M Grove… - ACM SIGMETRICS …, 2012 - dl.acm.org
Multicore multiprocessors use a Non Uniform Memory Architecture (NUMA) to improve their
scalability. However, NUMA introduces performance penalties due to remote memory …

Model-based, memory-centric performance and power optimization on NUMA multiprocessors

CY Su, D Li, DS Nikolopoulos… - 2012 IEEE …, 2012 - ieeexplore.ieee.org
Non-Uniform Memory Access (NUMA) architectures are ubiquitous in HPC systems. NUMA
along with other factors including socket layout, data placement, and memory contention …

Sparse grid regression for performance prediction using high-dimensional run time data

P Neumann - Euro-Par 2019: Parallel Processing Workshops: Euro …, 2020 - Springer
We employ sparse grid regression to predict the run time in three types of numerical
simulation: molecular dynamics (MD), weather and climate simulation. The impact of …

Optimization of the size of thread pool in runtime systems to enterprise application integration: a mathematical modelling approach

DL Freire, RZ Frantz, F Roos-Frantz, S Sawicki - TEMA (São Carlos), 2019 - SciELO Brasil
Companies seek technological alternatives that provide competitiveness for their business
processes. One of them is integration platforms, software tools that build integration …

Prediction strategies for power-aware computing on multicore processors

K Singh - 2009 - ecommons.cornell.edu
Diminishing performance returns and increasing power consumption of single-threaded
processors have made chip multiprocessors (CMPs) an industry imperative. Unfortunately …