DeNovo: Rethinking the memory hierarchy for disciplined parallelism

B Choi, R Komuravelli, H Sung… - 2011 International …, 2011
For parallelism to become tractable for mass programmers, shared-memory languages and
environments must evolve to enforce disciplined practices that ban "wild shared-memory …

A tagless coherence directory

J Zebchuk, V Srinivasan, MK Qureshi… - Proceedings of the 42nd …, 2009
A key challenge in architecting a CMP with many cores is maintaining cache coherence in
an efficient manner. Directory-based protocols avoid the bandwidth overhead of snoop …

Towards the ideal on-chip fabric for 1-to-many and many-to-1 communication

T Krishna, LS Peh, BM Beckmann… - Proceedings of the 44th …, 2011
The prevalence of multicore architectures has accentuated the need for scalable cache
coherence solutions. Many of the proposed designs use a mix of 1-to-1, 1-to-many (1-to-M) …

WAYPOINT: scaling coherence to thousand-core architectures

JH Kelm, MR Johnson, SS Lumetta… - Proceedings of the 19th …, 2010
In this paper, we evaluate a set of coherence architectures in the context of a 1024-core chip
multiprocessor (CMP) tailored to throughput-oriented parallel workloads. Based on our …

Subspace snooping: Filtering snoops with operating system support

D Kim, J Ahn, J Kim, J Huh - … of the 19th international conference on …, 2010
Although snoop-based coherence protocols provide fast cache-to-cache transfers with a
simple and robust coherence mechanism, scaling the protocols has been difficult due to the …

Atomic coherence: Leveraging nanophotonics to build race-free cache coherence protocols

D Vantrease, MH Lipasti… - 2011 IEEE 17th …, 2011
This paper advocates Atomic Coherence, a framework that simplifies cache coherence
protocol specification, design, and verification by decoupling races from the protocol's …

Cohesion: a hybrid memory model for accelerators

JH Kelm, DR Johnson, W Tuohy, SS Lumetta… - Proceedings of the 37th …, 2010
Two broad classes of memory models are available today: models with hardware cache
coherence, used in conventional chip multiprocessors, and models that rely upon software …

Universitat politecnica de Valencia

A García - Ingeniería del agua, 2014
Embedded devices are becoming more and more present everywhere. Moreover mobile
devices are becoming also more computationally powerful. These embedded architectures …

Efficiently supporting dynamic task parallelism on heterogeneous cache-coherent systems

M Wang, T Ta, L Cheng, C Batten - 2020 ACM/IEEE 47th …, 2020
Manycore processors, with tens to hundreds of tiny cores but no hardware-based cache
coherence, can offer tremendous peak throughput on highly parallel programs while being …

Adaptive cache coherence mechanisms with producer–consumer sharing optimization for chip multiprocessors

A Kayi, O Serres, T El-Ghazawi - IEEE Transactions on …, 2013
In chip multiprocessors (CMPs), maintaining cache coherence can account for a major
performance overhead. Write-invalidate protocols adapted by most CMPs generate high …