SynCron: Efficient synchronization support for near-data-processing architectures

C Giannoula, N Vijaykumar… - … Symposium on High …, 2021 - ieeexplore.ieee.org
Near-Data-Processing (NDP) architectures present a promising way to alleviate data
movement costs and can provide significant performance and energy benefits to parallel …

Cache coherence for GPU architectures

I Singh, A Shriraman, WWL Fung… - 2013 IEEE 19th …, 2013 - ieeexplore.ieee.org
While scalable coherence has been extensively studied in the context of general purpose
chip multiprocessors (CMPs), GPU architectures present a new set of challenges …

Write-light cache for energy harvesting systems

J Choi, J Zeng, D Lee, C Min, C Jung - Proceedings of the 50th Annual …, 2023 - dl.acm.org
Energy harvesting system has huge potential to enable battery-less Internet of Things (IoT)
services. However, it has been designed without a cache due to the difficulty of crash …

Selective GPU caches to eliminate CPU-GPU HW cache coherence

N Agarwal, D Nellans, E Ebrahimi… - … Symposium on High …, 2016 - ieeexplore.ieee.org
Cache coherence is ubiquitous in shared memory multiprocessors because it provides a
simple, high performance memory abstraction to programmers. Recent work suggests …

QuickRelease: A throughput-oriented approach to release consistency on GPUs

BA Hechtman, S Che, DR Hower, Y Tian… - 2014 IEEE 20th …, 2014 - ieeexplore.ieee.org
Graphics processing units (GPUs) have specialized throughput-oriented memory systems
optimized for streaming writes with scratchpad memories to capture locality explicitly …

A new perspective for efficient virtual-cache coherence

S Kaxiras, A Ros - Proceedings of the 40th Annual International …, 2013 - dl.acm.org
Coherent shared virtual memory (cSVM) is highly coveted for heterogeneous architectures
as it will simplify programming across different cores and manycore accelerators. In this …

TSO-CC: Consistency directed cache coherence for TSO

M Elver, V Nagarajan - 2014 IEEE 20th International …, 2014 - ieeexplore.ieee.org
Traditional directory coherence protocols are designed for the strictest consistency model,
sequential consistency (SC). When they are used for chip multiprocessors (CMPs) that …

Turning centralized coherence and distributed critical-section execution on their head: A new approach for scalable distributed shared memory

S Kaxiras, D Klaftenegger, M Norgren, A Ros… - Proceedings of the 24th …, 2015 - dl.acm.org
A coherent global address space in a distributed system enables shared memory
programming in a much larger scale than a single multicore or a single SMP. Without …

Lazy release consistency for GPUs

J Alsop, MS Orr, BM Beckmann… - 2016 49th Annual IEEE …, 2016 - ieeexplore.ieee.org
The heterogeneous-race-free (HRF) memory model has been embraced by the
Heterogeneous System Architecture (HSA) Foundation and OpenCL™ because it clearly …

Estimating effort by use case points: method, tool and case study

S Kusumoto, F Matukawa, K Inoue… - … on Software Metrics …, 2004 - ieeexplore.ieee.org
Use case point (UCP) method has been proposed to estimate software development effort in
early phase of software project and used in a lot of software organizations. Intuitively, UCP is …