Improving CC-NUMA performance using instruction-based prediction

J Chang, GS Sohi - ACM SIGARCH Computer Architecture News, 2006 - dl.acm.org

This paper presents CMP Cooperative Caching, a unified framework to manage a CMP's
aggregate on-chip cache resources. Cooperative caching combines the strengths of private …

Cited by 579 · Related articles · All 36 versions

[PDF] psu.edu

Transactional lock-free execution of lock-based programs

R Rajwar, JR Goodman - ACM SIGOPS Operating Systems Review, 2002 - dl.acm.org

This paper is motivated by the difficulty in writing correct high-performance programs. Writing
shared-memory multi-threaded programs imposes a complex trade-off between …

Cited by 454 · Related articles · All 32 versions

[PDF] wisc.edu

Token coherence: Decoupling performance and correctness

MMK Martin, MD Hill, DA Wood - ACM SIGARCH Computer Architecture …, 2003 - dl.acm.org

Many future shared-memory multiprocessor servers will both target commercial workloads
and use highly-integrated "glueless" designs. Implementing low-latency cache coherence in …

Cited by 425 · Related articles · All 33 versions

[PDF] iitd.ac.in

Performance pathologies in hardware transactional memory

J Bobba, KE Moore, H Volos, L Yen, MD Hill… - ACM SIGARCH …, 2007 - dl.acm.org

Hardware Transactional Memory (HTM) systems reflect choices from three key design
dimensions: conflict detection, version management, and conflict resolution. Previously …

Cited by 327 · Related articles · All 17 versions

[PDF] acm.org

Selective, accurate, and timely self-invalidation using last-touch prediction

AC Lai, B Falsafi - ACM SIGARCH Computer Architecture News, 2000 - dl.acm.org

Communication in cache-coherent distributed shared memory (DSM) often requires
invalidating (or writing back) cached copies of a memory block, incurring high overheads …

Cited by 219 · Related articles · All 20 versions

[BOOK][B] A primer on hardware prefetching

B Falsafi, TF Wenisch - 2022 - books.google.com

Since the 1970's, microprocessor-based digital platforms have been riding Moore's law,
allowing for doubling of density for the same area roughly every two years. However …

Cited by 118 · Related articles · All 7 versions

[PDF] epfl.ch

Accurate and complexity-effective spatial pattern prediction

CF Chen, SH Yang, B Falsafi… - … Symposium on High …, 2004 - ieeexplore.ieee.org

Recent research suggests that there are large variations in a cache's spatial usage, both
within and across programs. Unfortunately, conventional caches typically employ fixed …

Cited by 152 · Related articles · All 18 versions

[PDF] wisconsin.edu

Using destination-set prediction to improve the latency/bandwidth tradeoff in shared-memory multiprocessors

MMK Martin, PJ Harper, DJ Sorin, MD Hill… - Proceedings of the 30th …, 2003 - dl.acm.org

Destination-set prediction can improve the latency/bandwidth tradeoff in shared-memory
multiprocessors. The destination set is the collection of processors that receive a particular …

Cited by 175 · Related articles · All 26 versions

[PDF] googleapis.com

Methods to perform disk writes in a distributed shared disk system needing consistency across failures

S Chandrasekaran, RJ Bamford, WH Bridge… - US Patent …, 2007 - Google Patents

Techniques are provided for managing caches in a system with multiple caches that may
contain different copies of the same data item. Specifically, techniques are provided for …

Cited by 105 · Related articles · All 4 versions

[PDF] researchgate.net

SARC coherence: Scaling directory cache coherence in performance and power

S Kaxiras, G Keramidas - IEEE micro, 2010 - ieeexplore.ieee.org

The SARC project seeks to improve power scalability of shared-memory chip
multiprocessors (CMPs) by making directory coherence more efficient in both power and …

Cited by 91 · Related articles · All 15 versions
