GLocks: Efficient support for highly-contended locks in many-core CMPs

T David, R Guerraoui, V Trigonakis - Proceedings of the Twenty-Fourth …, 2013 - dl.acm.org

This paper presents the most exhaustive study of synchronization to date. We span multiple
layers, from hardware cache-coherence protocols up to high-level concurrent software. We …

Cited by 335

Syncron: Efficient synchronization support for near-data-processing architectures

C Giannoula, N Vijaykumar… - … Symposium on High …, 2021 - ieeexplore.ieee.org

Near-Data-Processing (NDP) architectures present a promising way to alleviate data
movement costs and can provide significant performance and energy benefits to parallel …

Cited by 96

Remote core locking: Migrating critical-section execution to improve the performance of multithreaded applications

JP Lozi, F David, G Thomas, J Lawall… - 2012 USENIX Annual …, 2012 - usenix.org

The scalability of multithreaded applications on current multicore systems is hampered by
the performance of lock algorithms, due to the costs of access contention and cache misses …

Cited by 210

Ffwd: Delegation is (much) faster than you think

S Roghanchi, J Eriksson, N Basu - … of the 26th Symposium on Operating …, 2017 - dl.acm.org

We revisit the question of delegation vs. synchronized access to shared memory, and show
through analysis and demonstration that delegation can be much faster than locking under a …

Cited by 49

WiSync: An architecture for fast synchronization through on-chip wireless communication

S Abadal, A Cabellos-Aparicio, E Alarcon… - ACM SIGPLAN …, 2016 - dl.acm.org

In shared-memory multiprocessing, fine-grain synchronization is challenging because it
requires frequent communication. As technology scaling delivers larger manycore chips …

Cited by 54

Adaptive contention management for fine-grained synchronization on commodity GPUs

L Gao, J Wang, W Zhang - ACM Transactions on Architecture and Code …, 2022 - dl.acm.org

As more emerging applications are moving to GPUs, fine-grained synchronization has
become imperative. However, their performance can be severely impaired in case of …

Cited by 16

Fast and portable locking for multicore architectures

JP Lozi, F David, G Thomas, J Lawall… - ACM Transactions on …, 2016 - dl.acm.org

The scalability of multithreaded applications on current multicore systems is hampered by
the performance of lock algorithms, due to the costs of access contention and cache misses …

Cited by 41

Efficient hardware barrier synchronization in many-core CMPs

JL Abellán, J Fernández… - IEEE Transactions on …, 2011 - ieeexplore.ieee.org

Traditional software-based barrier implementations for shared memory parallel machines
tend to produce hotspots in terms of memory and network contention as the number of …

Cited by 36

MiSAR: Minimalistic synchronization accelerator with resource overflow management

CK Liang, M Prvulovic - ACM SIGARCH Computer Architecture News, 2015 - dl.acm.org

While numerous hardware synchronization mechanisms have been proposed, they either
no longer function or suffer great performance loss when their hardware resources are …

Cited by 26

Scalable adaptive NUMA-aware lock

M Zhang, H Chen, L Cheng, FCM Lau… - IEEE Transactions on …, 2016 - ieeexplore.ieee.org

Scalable locking is a key building block for scalable multi-threaded software. Its performance
is especially critical in multi-socket, multi-core machines with non-uniform memory access …

Cited by 19
