Interactions between compression and prefetching in chip multiprocessors

S Mittal - ACM Computing Surveys (CSUR), 2016 - dl.acm.org

As the trends of process scaling make memory systems an even more crucial bottleneck, the
importance of latency hiding techniques such as prefetching grows further. However, naively …

Cited by 133 · Related articles · All 3 versions

A survey of architectural approaches for data compression in cache and main memory systems

S Mittal, JS Vetter - IEEE Transactions on Parallel and …, 2015 - ieeexplore.ieee.org

As the number of cores on a chip increases and key applications become even more data-
intensive, memory systems in modern processors have to deal with increasingly large …

Cited by 134 · Related articles · All 5 versions

C-pack: A high-performance microprocessor cache compression algorithm

X Chen, L Yang, RP Dick, L Shang… - IEEE transactions on …, 2009 - ieeexplore.ieee.org

Microprocessor designers have been torn between tight constraints on the amount of on-
chip cache memory and the high latency of off-chip memory, such as dynamic random …

Cited by 226 · Related articles · All 13 versions

SC2: A statistical compression cache scheme

A Arelakis, P Stenstrom - ACM SIGARCH Computer Architecture News, 2014 - dl.acm.org

Low utilization of on-chip cache capacity limits performance and wastes energy because of
the long latency, limited bandwidth, and energy consumption associated with off-chip …

Cited by 146 · Related articles · All 6 versions

A case for core-assisted bottleneck acceleration in GPUs: enabling flexible data compression with assist warps

N Vijaykumar, G Pekhimenko, A Jog… - ACM SIGARCH …, 2015 - dl.acm.org

Modern Graphics Processing Units (GPUs) are well provisioned to support the concurrent
execution of thousands of threads. Unfortunately, different bottlenecks during execution and …

Cited by 129 · Related articles · All 6 versions

PACMan: prefetch-aware cache management for high performance caching

CJ Wu, A Jaleel, M Martonosi, SC Steely Jr… - Proceedings of the 44th …, 2011 - dl.acm.org

Hardware prefetching and last-level cache (LLC) management are two independent
mechanisms to mitigate the growing latency to memory. However, the interaction between …

Cited by 160 · Related articles · All 14 versions

Daemon: Architectural support for efficient data movement in fully disaggregated systems

C Giannoula, K Huang, J Tang, N Koziris… - Proceedings of the …, 2023 - dl.acm.org

Resource disaggregation offers a cost effective solution to resource scaling, utilization, and
failure-handling in data centers by physically separating hardware devices in a server …

Cited by 11 · Related articles

Understanding and improving the latency of DRAM-based memory systems

KK Chang - 2017 - search.proquest.com

Over the past two decades, the storage capacity and access bandwidth of main memory
have improved tremendously, by 128x and 20x, respectively. These improvements are …

Cited by 88 · Related articles · All 9 versions

Approximate communication: Techniques for reducing communication bottlenecks in large-scale parallel systems

F Betzel, K Khatamifard, H Suresh, DJ Lilja… - ACM Computing …, 2018 - dl.acm.org

Approximate computing has gained research attention recently as a way to increase energy
efficiency and/or performance by exploiting some applications' intrinsic error resiliency …

Cited by 69 · Related articles · All 10 versions

A case for toggle-aware compression for GPU systems

G Pekhimenko, E Bolotin, N Vijaykumar… - … Symposium on High …, 2016 - ieeexplore.ieee.org

Data compression can be an effective method to achieve higher system performance and
energy efficiency in modern data-intensive applications by exploiting redundancy and data …

Cited by 91 · Related articles · All 20 versions
