Irregular workloads are typically bottlenecked by the memory system. These workloads often use sparse data representations, e.g., compressed sparse row/column (CSR/CSC), to …
Y Wang, L Orosa, X Peng, Y Guo… - 2020 53rd Annual …, 2020 - ieeexplore.ieee.org
Main memory, composed of DRAM, is a performance bottleneck for many applications, due to the high DRAM access latency. In-DRAM caches work to mitigate this latency by …
Modern data center applications have rapidly expanding instruction footprints that lead to frequent instruction cache misses, increasing cost and degrading data center performance …
As much of the world's computing continues to move into the cloud, the overprovisioning of computing resources to ensure the performance isolation of latency-sensitive tasks, such as …
Applications extensively use data objects with a regular and fixed layout, which leads to the recurrence of access patterns over memory regions. Spatial data prefetching techniques …
High main memory latency continues to limit performance of modern high-performance out-of-order cores. While DRAM latency has remained nearly the same over many generations …
The large instruction working sets of private and public cloud workloads lead to frequent instruction cache misses and costs in the millions of dollars. While prior work has identified …
TA Khan, D Zhang, A Sriraman… - 2021 ACM/IEEE 48th …, 2021 - ieeexplore.ieee.org
Modern data center applications exhibit deep software stacks, resulting in large instruction footprints that frequently cause instruction cache misses, degrading performance, cost, and …
Disaggregated memory can address resource provisioning inefficiencies in current datacenters. Multiple software runtimes for disaggregated memory have been proposed in …