Complexity-effective multicore coherence

A Ros, S Kaxiras - Proceedings of the 21st international conference on …, 2012 - dl.acm.org
Much of the complexity and overhead (directory, state bits, invalidations) of a typical
directory coherence implementation stems from the effort to make it "invisible" even to the …

SCD: A scalable coherence directory with flexible sharer set encoding

D Sanchez, C Kozyrakis - IEEE International Symposium on …, 2012 - ieeexplore.ieee.org
Large-scale CMPs with hundreds of cores require a directory-based protocol to maintain
cache coherence. However, previously proposed coherence directories are hard to scale …

Hardware transactional memory for GPU architectures

WWL Fung, I Singh, A Brownsword… - Proceedings of the 44th …, 2011 - dl.acm.org
Graphics processor units (GPUs) are designed to efficiently exploit thread level parallelism
(TLP), multiplexing execution of 1000s of concurrent threads on a relatively smaller set of …

System and method for simplifying cache coherence using multiple write policies

S Kaxiras, A Ros - US Patent 9,274,960, 2016 - Google Patents
System and methods for cache coherence in a multi-core processing environment
having a local/shared cache hierarchy. The system includes multiple processor cores, a …

TSO-CC: Consistency directed cache coherence for TSO

M Elver, V Nagarajan - 2014 IEEE 20th International …, 2014 - ieeexplore.ieee.org
Traditional directory coherence protocols are designed for the strictest consistency model,
sequential consistency (SC). When they are used for chip multiprocessors (CMPs) that …

Coherence domain restriction on large scale systems

Y Fu, TM Nguyen, D Wentzlaff - … of the 48th International Symposium on …, 2015 - dl.acm.org
Designing massive scale cache coherence systems has been an elusive goal. Whether it be
on large-scale GPUs, future thousand-core chips, or across million-core warehouse scale …

Generating efficient data movement code for heterogeneous architectures with distributed-memory

R Dathathri, C Reddy, T Ramashekar… - Proceedings of the …, 2013 - ieeexplore.ieee.org
Programming for parallel architectures that do not have a shared address space is extremely
difficult due to the need for explicit communication between memories of different compute …

Multi-grain coherence directories

J Zebchuk, B Falsafi, A Moshovos - … of the 46th Annual IEEE/ACM …, 2013 - dl.acm.org
Conventional directory coherence operates at the finest granularity possible, that of a cache
block. While simple, this organization fails to exploit frequent application behavior: at any …

SPATL: Honey, I shrunk the coherence directory

H Zhao, A Shriraman, S Dwarkadas… - 2011 International …, 2011 - ieeexplore.ieee.org
One of the key scalability challenges of on-chip coherence in a multicore chip is the
coherence directory, which provides information on sharing of cache blocks. Shadow tags …

TimeCache: using time to eliminate cache side channels when sharing software

D Ojha, S Dwarkadas - 2021 ACM/IEEE 48th Annual …, 2021 - ieeexplore.ieee.org
Timing side channels have been used to extract cryptographic keys and sensitive
documents even from trusted enclaves. Specifically, cache side channels created by reuse …