Cohesion: a hybrid memory model for accelerators

C Giannoula, N Vijaykumar… - … Symposium on High …, 2021 - ieeexplore.ieee.org

Near-Data-Processing (NDP) architectures present a promising way to alleviate data
movement costs and can provide significant performance and energy benefits to parallel …

Cited by: 96 | Related articles | All 13 versions

Co-designing accelerators and SoC interfaces using gem5-Aladdin

YS Shao, SL Xi, V Srinivasan, GY Wei… - 2016 49th Annual …, 2016 - ieeexplore.ieee.org

Increasing demand for power-efficient, high-performance computing has spurred a growing
number and diversity of hardware accelerators in mobile and server Systems on Chip …

Cited by: 218 | Related articles | All 7 versions

Heterogeneous system coherence for integrated CPU-GPU systems

J Power, A Basu, J Gu, S Puthoor… - Proceedings of the 46th …, 2013 - dl.acm.org

Many future heterogeneous systems will integrate CPUs and GPUs physically on a single
chip and logically connect them via shared memory to avoid explicit data copying. Making …

Cited by: 216 | Related articles | All 15 versions

Cache coherence for GPU architectures

I Singh, A Shriraman, WWL Fung… - 2013 IEEE 19th …, 2013 - ieeexplore.ieee.org

While scalable coherence has been extensively studied in the context of general purpose
chip multiprocessors (CMPs), GPU architectures present a new set of challenges …

Cited by: 220 | Related articles | All 16 versions

Throughput-effective on-chip networks for manycore accelerators

A Bakhoda, J Kim, TM Aamodt - 2010 43rd Annual IEEE/ACM …, 2010 - ieeexplore.ieee.org

As the number of cores and threads in manycore compute accelerators such as Graphics
Processing Units (GPU) increases, so does the importance of on-chip interconnection …

Cited by: 212 | Related articles | All 13 versions

Heterogeneous-race-free memory models

DR Hower, BA Hechtman, BM Beckmann… - Proceedings of the 19th …, 2014 - dl.acm.org

Commodity heterogeneous systems (e.g., integrated CPUs and GPUs) now support a
unified, shared memory address space for all components. Because the latency of global …

Cited by: 135 | Related articles | All 11 versions

Cohmeleon: Learning-based orchestration of accelerator coherence in heterogeneous SoCs

J Zuckerman, D Giri, J Kwon, P Mantovani… - MICRO-54: 54th Annual …, 2021 - dl.acm.org

One of the most critical aspects of integrating loosely-coupled accelerators in
heterogeneous SoC architectures is orchestrating their interactions with the memory …

Cited by: 25 | Related articles | All 7 versions

DeNovoND: Efficient hardware support for disciplined non-determinism

H Sung, R Komuravelli, SV Adve - ACM SIGPLAN Notices, 2013 - dl.acm.org

Recent work has shown that disciplined shared-memory programming models that provide
deterministic-by-default semantics can simplify both parallel software and hardware …

Cited by: 81 | Related articles | All 12 versions

ArMOR: Defending against memory consistency model mismatches in heterogeneous architectures

D Lustig, C Trippel, M Pellauer… - Proceedings of the 42nd …, 2015 - dl.acm.org

Architectural heterogeneity is increasing: numerous products and studies have proven the
benefits of combining cores and accelerators with varying ISAs into a single system …

Cited by: 53 | Related articles | All 18 versions

Exploring memory consistency for massively-threaded throughput-oriented processors

BA Hechtman, DJ Sorin - Proceedings of the 40th Annual International …, 2013 - dl.acm.org

We re-visit the issue of hardware consistency models in the new context of
massively-threaded throughput-oriented processors (MTTOPs). A prominent example of an MTTOP is a …

Cited by: 63 | Related articles | All 7 versions
