Spin-transfer torque memories: Devices, circuits, and systems

X Fong, Y Kim, R Venkatesan, SH Choday… - Proceedings of the …, 2016 - ieeexplore.ieee.org
Spin-transfer torque magnetic memory (STT-MRAM) has gained significant research interest
due to its nonvolatility and zero standby leakage, near unlimited endurance, excellent …

A survey of cache bypassing techniques

S Mittal - Journal of Low Power Electronics and Applications, 2016 - mdpi.com
With increasing core-count, the cache demand of modern processors has also increased.
However, due to strict area/power budgets and presence of poor data-locality workloads …

COBRRA: COntention-aware cache Bypass with Request-Response Arbitration

A Bagchi, D Joshi, PR Panda - ACM Transactions on Embedded …, 2024 - dl.acm.org
In modern multi-processor systems-on-chip (MPSoCs), requests from different processor
cores, accelerators, and their responses from the lower-level memory contend for the shared …

Read-tuned STT-RAM and eDRAM cache hierarchies for throughput and energy optimization

N Khoshavi, RF Demara - IEEE Access, 2018 - ieeexplore.ieee.org
As capacity and complexity of on-chip cache memory hierarchy increases, the service cost to
the critical loads from last level cache (LLC), which are frequently repeated, has become a …

PM3: Power modeling and power management for processing-in-memory

C Zhang, T Meng, G Sun - 2018 IEEE International symposium …, 2018 - ieeexplore.ieee.org
Processing-in-Memory (PIM) has been proposed as a solution to accelerate data-intensive
applications, such as real-time Big Data processing and neural networks. The acceleration …

POEM: Performance Optimization and Endurance Management for Non-volatile Caches

A Bagchi, Dharamjeet, O Rishabh, M Suri… - ACM Transactions on …, 2024 - dl.acm.org
Non-volatile memories (NVMs) with their high storage density and ultra-low leakage power
offer promising potential for redesigning the memory hierarchy in next-generation Multi …

Holistic management of the GPGPU memory hierarchy to manage warp-level latency tolerance

R Ausavarungnirun, S Ghose, O Kayıran… - arXiv preprint arXiv …, 2018 - arxiv.org
In a modern GPU architecture, all threads within a warp execute the same instruction in
lockstep. For a memory instruction, this can lead to memory divergence: the memory …

NOVELLA: Nonvolatile Last-Level Cache Bypass for Optimizing Off-Chip Memory Energy

A Bagchi, O Rishabh, PR Panda - IEEE Transactions on …, 2024 - ieeexplore.ieee.org
Contemporary multiprocessor systems-on-chips (MPSoCs) continue to confront
energy-related challenges, primarily originating from off-chip data movements. Nonvolatile …

Techniques for shared resource management in systems with throughput processors

R Ausavarungnirun - 2017 - search.proquest.com
The continued growth of the computational capability of throughput processors has made
throughput processors the platform of choice for a wide variety of high performance …

PROLONG: Priority based Write Bypassing Technique for Longer Lifetime in STT-RAM based LLC

P Sinha, KP BV, S Das, VK Tavva - Proceedings of the International …, 2024 - dl.acm.org
The rise of data-driven applications requires larger on-chip Last Level Caches (LLCs) in
multicore systems which need denser chips with lower power consumption. Non-Volatile …