A modern primer on processing in memory

O Mutlu, S Ghose, J Gómez-Luna… - … computing: from devices …, 2022 - Springer
Modern computing systems are overwhelmingly designed to move data to computation. This
design choice goes directly against at least three key trends in computing that cause …

A survey of techniques for cache partitioning in multicore processors

S Mittal - ACM Computing Surveys (CSUR), 2017 - dl.acm.org
As the number of on-chip cores and memory demands of applications increase, judicious
management of cache resources has become not merely attractive but imperative. Cache …

Adaptive-latency DRAM: Optimizing DRAM timing for the common-case

D Lee, Y Kim, G Pekhimenko, S Khan… - 2015 IEEE 21st …, 2015 - ieeexplore.ieee.org
In current systems, memory accesses to a DRAM chip must obey a set of minimum latency
restrictions specified in the DRAM standard. Such timing parameters exist to guarantee …

[PDF] Research problems and opportunities in memory systems

O Mutlu, L Subramanian - Supercomputing frontiers and …, 2014 - superfri.susu.ru
The memory system is a fundamental performance and energy bottleneck in almost all
computing systems. Recent system design, application, and technology trends that require …

Load value approximation

J San Miguel, M Badr, NE Jerger - 2014 47th Annual IEEE …, 2014 - ieeexplore.ieee.org
Approximate computing explores opportunities that emerge when applications can tolerate
error or inexactness. These applications, which range from multimedia processing to …

Designing a cost-effective cache replacement policy using machine learning

S Sethumurugan, J Yin, J Sartori - 2021 IEEE International …, 2021 - ieeexplore.ieee.org
Extensive research has been carried out to improve cache replacement policies, yet
designing an efficient cache replacement policy that incurs low hardware overhead remains …

BEAR: Techniques for mitigating bandwidth bloat in gigascale DRAM caches

C Chou, A Jaleel, MK Qureshi - ACM SIGARCH Computer Architecture …, 2015 - dl.acm.org
Die stacking memory technology can enable gigascale DRAM caches that can operate at 4x-
8x higher bandwidth than commodity DRAM. Such caches can improve system performance …

Application clustering policies to address system fairness with Intel's cache allocation technology

V Selfa, J Sahuquillo, L Eeckhout… - 2017 26th …, 2017 - ieeexplore.ieee.org
Achieving system fairness is a major design concern in current multicore processors.
Unfairness arises due to contention in the shared resources of the system, such as the LLC …

Kill the program counter: Reconstructing program behavior in the processor cache hierarchy

J Kim, E Teran, PV Gratz, DA Jiménez… - ACM SIGPLAN …, 2017 - dl.acm.org
Data prefetching and cache replacement algorithms have been intensively studied in the
design of high performance microprocessors. Typically, the data prefetcher operates in the …

Exploiting compressed block size as an indicator of future reuse

G Pekhimenko, T Huberty, R Cai… - 2015 IEEE 21st …, 2015 - ieeexplore.ieee.org
We introduce a set of new Compression-Aware Management Policies (CAMP) for on-chip
caches that employ data compression. Our management policies are based on two key …