A machine learning-based approach for thread mapping on transactional memory applications

Data locality in high performance computing, big data, and converged systems: An analysis of the cutting edge and a future system architecture

S Usman, R Mehmood, I Katib, A Albeshri - Electronics, 2022 - mdpi.com

Big data has revolutionized science and technology leading to the transformation of our
societies. High-performance computing (HPC) provides the necessary computational power …

Cited by 21 · Related articles · All 5 versions

[PDF] arxiv.org

Machine learning in compiler optimization

Z Wang, M O'Boyle - Proceedings of the IEEE, 2018 - ieeexplore.ieee.org

In the last decade, machine-learning-based compilation has moved from an obscure
research niche to a mainstream activity. In this paper, we describe the relationship between …

Cited by 262 · Related articles · All 10 versions

[PDF] springer.com

Using meta-heuristics and machine learning for software optimization of parallel computing systems: a systematic literature review

S Memeti, S Pllana, A Binotto, J Kołodziej, I Brandic - Computing, 2019 - Springer

While modern parallel computing systems offer high performance, utilizing these powerful
computing resources to the highest possible extent demands advanced knowledge of …

Cited by 71 · Related articles · All 13 versions

[PDF] arxiv.org

Learning intermediate representations using graph neural networks for NUMA and prefetchers optimization

A TehraniJamsaz, M Popov, A Dutta… - 2022 IEEE …, 2022 - ieeexplore.ieee.org

There is a large space of NUMA and hardware prefetcher configurations that can
significantly impact the performance of an application. Previous studies have demonstrated …

Cited by 18 · Related articles · All 8 versions

[PDF] acm.org

Modeling and optimizing NUMA effects and prefetching with machine learning

I Sánchez Barrera, D Black-Schaffer, M Casas… - Proceedings of the 34th …, 2020 - dl.acm.org

Both NUMA thread/data placement and hardware prefetcher configuration have significant
impacts on HPC performance. Optimizing both together leads to a large and complex design …

Cited by 37 · Related articles · All 3 versions

[PDF] psu.edu

Compiler support for selective page migration in NUMA architectures

G Piccoli, HN Santos, RE Rodrigues, C Pousa… - Proceedings of the 23rd …, 2014 - dl.acm.org

Current high-performance multicore processors provide users with a non-uniform memory
access model (NUMA). These systems perform better when threads access data on memory …

Cited by 53 · Related articles · All 8 versions

[PDF] academia.edu

Machine learning-based self-adjusting concurrency in software transactional memory systems

D Rughetti, P Di Sanzo, B Ciciani… - 2012 IEEE 20th …, 2012 - ieeexplore.ieee.org

One of the problems of Software-Transactional-Memory (STM) systems is the performance
degradation that can be experienced when applications run with a non-optimal concurrency …

Cited by 66 · Related articles · All 10 versions

[PDF] hal.science

Data and thread placement in NUMA architectures: A statistical learning approach

N Denoyelle, B Goglin, E Jeannot… - Proceedings of the 48th …, 2019 - dl.acm.org

Nowadays, NUMA architectures are common in compute-intensive systems. Achieving high
performance for multi-threaded application requires both a careful placement of threads on …

Cited by 31 · Related articles · All 13 versions

[PDF] ieee.org

ZAKI+: A machine learning based process mapping tool for SpMV computations on distributed memory architectures

S Usman, R Mehmood, I Katib, A Albeshri - IEEE Access, 2019 - ieeexplore.ieee.org

Smart cities and other cyber-physical systems (CPSs) rely on various scientific, engineering,
business, and social applications that provide timely intelligence for their design, operations …

Cited by 27 · Related articles · All 4 versions

[PDF] acm.org

MIREncoder: Multi-modal IR-based pretrained embeddings for performance optimizations

A Dutta, A Jannesari - Proceedings of the 2024 International Conference …, 2024 - dl.acm.org

One of the primary areas of interest in High Performance Computing is the improvement of
performance of parallel workloads. Nowadays, compilable source code-based optimization …

Cited by 1 · Related articles · All 9 versions
