Compute caches

PROMISE: An end-to-end design of a programmable mixed-signal accelerator for machine-learning algorithms

P Srivastava, M Kang, SK Gonugondla… - 2018 ACM/IEEE 45th …, 2018 - ieeexplore.ieee.org

Analog/mixed-signal machine learning (ML) accelerators exploit the unique computing
capability of analog/mixed-signal circuits and inherent error tolerance of ML algorithms to …

Cited by 71 · Related articles · All 8 versions

[PDF] acm.org

Täkō: A polymorphic cache hierarchy for general-purpose optimization of data movement

BC Schwedock, P Yoovidhya, J Seibert… - Proceedings of the 49th …, 2022 - dl.acm.org

Current systems hide data movement from software behind the load-store interface.
Software's inability to observe and respond to data movement is the root cause of many …

Cited by 16 · Related articles · All 8 versions

[PDF] acm.org

KrakenOnMem: a memristor-augmented HW/SW framework for taxonomic profiling

T Shahroodi, M Zahedi, A Singh, S Wong… - Proceedings of the 36th …, 2022 - dl.acm.org

State-of-the-art taxonomic profilers that comprise the first step in larger-context metagenomic
studies have proven to be computationally intensive, i.e., while accurate, they come at the …

Cited by 16 · Related articles · All 6 versions

[PDF] arxiv.org

Venice: Improving Solid-State Drive Parallelism at Low Cost via Conflict-Free Accesses

R Nadig, M Sadrosadati, H Mao, NM Ghiasi… - Proceedings of the 50th …, 2023 - dl.acm.org

The performance and capacity of solid-state drives (SSDs) are continuously improving to
meet the increasing demands of modern data-intensive applications. Unfortunately …

Cited by 6 · Related articles · All 7 versions

[PDF] acm.org

Wire-aware architecture and dataflow for CNN accelerators

S Gudaparthi, S Narayanan… - Proceedings of the …, 2019 - dl.acm.org

In spite of several recent advancements, data movement in modern CNN accelerators
remains a significant bottleneck. Architectures like Eyeriss implement large scratchpads …

Cited by 44 · Related articles · All 5 versions

An eight-core RISC-V processor with compute near last level cache in Intel 4 CMOS

GK Chen, PC Knag, C Tokunaga… - IEEE Journal of Solid …, 2022 - ieeexplore.ieee.org

An eight-core 64-b processor extends RISC-V to perform multiply–accumulate (MAC) within
the shared last level cache (LLC). Instead of moving data from the LLC to the core, compute …

Cited by 10 · Related articles · All 2 versions

[PDF] nsf.gov

Stream floating: Enabling proactive and decentralized cache optimizations

Z Wang, J Weng, J Lowe-Power, J Gaur… - … Symposium on High …, 2021 - ieeexplore.ieee.org

As multicore systems continue to grow in scale and on-chip memory capacity, the on-chip
network bandwidth and latency become problematic bottlenecks. Because of this …

Cited by 26 · Related articles · All 10 versions

[PDF] researchgate.net

A high throughput in-MRAM-computing scheme using hybrid p-SOT-MTJ/GAA-CNTFET

Z Tong, Y Xu, Y Liu, X Duan, H Tang… - … on Circuits and …, 2023 - ieeexplore.ieee.org

Silicon-based semiconductor transistors are approaching their physical limits due to
shrinking feature sizes. Simultaneously, traditional silicon-based von Neumann …

Cited by 4 · Related articles · All 2 versions

A survey of memory-centric energy efficient computer architecture

C Zhang, H Sun, S Li, Y Wang… - IEEE Transactions on …, 2023 - ieeexplore.ieee.org

Energy efficient architecture is essential to improve both the performance and power
consumption of a computer system. However, modern computers suffer from the severe …

Cited by 6 · Related articles · All 4 versions

[PDF] epfl.ch

Rebooting virtual memory with Midgard

S Gupta, A Bhattacharyya, Y Oh… - 2021 ACM/IEEE 48th …, 2021 - ieeexplore.ieee.org

Computer systems designers are building cache hierarchies with higher capacity to capture
the ever-increasing working sets of modern workloads. Cache hierarchies with higher …

Cited by 24 · Related articles · All 16 versions
