Two algorithms for barrier synchronization

M Herlihy, N Shavit, V Luchangco, M Spear - 2020 - books.google.com

The Art of Multiprocessor Programming, Second Edition, provides users with an authoritative
guide to multicore programming. This updated edition introduces higher level software …

Cited by 2558 - Related articles - All 10 versions

Algorithms for scalable synchronization on shared-memory multiprocessors

JM Mellor-Crummey, ML Scott - ACM Transactions on Computer …, 1991 - dl.acm.org

Busy-wait techniques are heavily used for mutual exclusion and barrier synchronization in
shared-memory parallel programs. Unfortunately, typical implementations of busy-waiting …

Cited by 1991 - Related articles - All 60 versions

Optimization of collective communication operations in MPICH

R Thakur, R Rabenseifner… - The International Journal …, 2005 - journals.sagepub.com

We describe our work on improving the performance of collective communication operations
in MPICH for clusters connected by switched networks. For each collective operation, we …

Cited by 1223 - Related articles - All 15 versions

Memory coherence in shared virtual memory systems

K Li, P Hudak - ACM Transactions on Computer Systems (TOCS), 1989 - dl.acm.org

The memory coherence problem in designing and implementing a shared virtual memory on
loosely coupled multiprocessors is studied in depth. Two classes of algorithms, centralized …

Cited by 2259 - Related articles - All 83 versions

[BOOK][B] Program synthesis by sketching

A Solar-Lezama - 2008 - search.proquest.com

The goal of software synthesis is to generate programs automatically from high-level
specifications. However, efficient implementations for challenging programs require a …

Cited by 558 - Related articles - All 7 versions

Syncron: Efficient synchronization support for near-data-processing architectures

C Giannoula, N Vijaykumar… - … Symposium on High …, 2021 - ieeexplore.ieee.org

Near-Data-Processing (NDP) architectures present a promising way to alleviate data
movement costs and can provide significant performance and energy benefits to parallel …

Cited by 96 - Related articles - All 13 versions

Stateful Serverless Computing with Crucial

D Barcelona-Pons, P Sutra… - ACM Transactions on …, 2022 - dl.acm.org

Serverless computing greatly simplifies the use of cloud resources. In particular, Function-as-
a-Service (FaaS) platforms enable programmers to develop applications as individual …

Cited by 63 - Related articles - All 5 versions

Implementation and performance of Munin

JB Carter, JK Bennett, W Zwaenepoel - ACM SIGOPS Operating …, 1991 - dl.acm.org

Munin is a distributed shared memory (DSM) system that allows shared memory parallel
programs to be executed efficiently on distributed memory multiprocessors. Munin is unique …

Cited by 1175 - Related articles - All 26 versions

SparCML: High-performance sparse communication for machine learning

C Renggli, S Ashkboos, M Aghagolzadeh… - Proceedings of the …, 2019 - dl.acm.org

Applying machine learning techniques to the quickly growing data in science and industry
requires highly-scalable algorithms. Large datasets are most commonly processed "data …

Cited by 152 - Related articles - All 22 versions

Near-optimal sparse allreduce for distributed deep learning

S Li, T Hoefler - Proceedings of the 27th ACM SIGPLAN Symposium on …, 2022 - dl.acm.org

Communication overhead is one of the major obstacles to train large deep learning models
at scale. Gradient sparsification is a promising technique to reduce the communication …

Cited by 50 - Related articles - All 30 versions
