S Mittal, JS Vetter - IEEE Transactions on Parallel and …, 2015 - ieeexplore.ieee.org
As the number of cores on a chip increases and key applications become even more data-intensive, memory systems in modern processors have to deal with increasingly large …
X Chen, L Yang, RP Dick, L Shang… - IEEE transactions on …, 2009 - ieeexplore.ieee.org
Microprocessor designers have been torn between tight constraints on the amount of on-chip cache memory and the high latency of off-chip memory, such as dynamic random …
Low utilization of on-chip cache capacity limits performance and wastes energy because of the long latency, limited bandwidth, and energy consumption associated with off-chip …
Modern Graphics Processing Units (GPUs) are well provisioned to support the concurrent execution of thousands of threads. Unfortunately, different bottlenecks during execution and …
CJ Wu, A Jaleel, M Martonosi, SC Steely Jr… - Proceedings of the 44th …, 2011 - dl.acm.org
Hardware prefetching and last-level cache (LLC) management are two independent mechanisms to mitigate the growing latency to memory. However, the interaction between …
C Giannoula, K Huang, J Tang, N Koziris… - Proceedings of the …, 2023 - dl.acm.org
Resource disaggregation offers a cost-effective solution to resource scaling, utilization, and failure handling in data centers by physically separating hardware devices in a server …
Over the past two decades, the storage capacity and access bandwidth of main memory have improved tremendously, by 128x and 20x, respectively. These improvements are …
Approximate computing has gained research attention recently as a way to increase energy efficiency and/or performance by exploiting some applications' intrinsic error resiliency …
Data compression can be an effective method to achieve higher system performance and energy efficiency in modern data-intensive applications by exploiting redundancy and data …