Transmuter: Bridging the efficiency gap using memory and dataflow reconfiguration

G Gerogiannis, S Yesil, D Lenadora, D Cao… - Proceedings of the 50th …, 2023 - dl.acm.org

The widespread use of Sparse Matrix Dense Matrix Multiplication (SpMM) and Sampled
Dense Matrix Dense Matrix Multiplication (SDDMM) kernels makes them candidates for …

Cited by: 10 | Related articles | All 6 versions

[PDF] umich.edu

Menda: A near-memory multi-way merge solution for sparse transposition and dataflows

S Feng, X He, KY Chen, L Ke, X Zhang… - Proceedings of the 49th …, 2022 - dl.acm.org

Near-memory processing has been extensively studied to optimize memory intensive
workloads. However, none of the proposed designs address sparse matrix transposition, an …

Cited by: 16 | Related articles | All 3 versions

[PDF] acm.org

SparseAdapt: Runtime control for sparse linear algebra on a reconfigurable accelerator

S Pal, A Amarnath, S Feng, M O'Boyle… - MICRO-54: 54th Annual …, 2021 - dl.acm.org

Dynamic adaptation is a post-silicon optimization technique that adapts the hardware to
workload phases. However, current adaptive approaches are oblivious to implicit phases …

Cited by: 19 | Related articles | All 5 versions

[PDF] arxiv.org

Special session: Towards an agile design methodology for efficient, reliable, and secure ML systems

S Dave, A Marchisio, MA Hanif… - 2022 IEEE 40th VLSI …, 2022 - ieeexplore.ieee.org

The real-world use cases of Machine Learning (ML) have exploded over the past few years.
However, the current computing infrastructure is insufficient to support all real-world …

Cited by: 15 | Related articles | All 18 versions

[PDF] ed.ac.uk

Cosparse: A software and hardware reconfigurable SpMV framework for graph analytics

S Feng, J Sun, S Pal, X He, K Kaszyk… - 2021 58th ACM/IEEE …, 2021 - ieeexplore.ieee.org

Sparse matrix-vector multiplication (SpMV) is a critical building block for iterative graph
analytics algorithms. Typically, such algorithms have a varying active vertex set across …

Cited by: 21 | Related articles | All 6 versions

[PDF] umich.edu

Versa: A 36-core systolic multiprocessor with dynamically reconfigurable interconnect and memory

S Kim, M Fayazi, A Daftardar, KY Chen… - IEEE Journal of Solid …, 2022 - ieeexplore.ieee.org

We present Versa, an energy-efficient 36-core systolic multiprocessor with dynamically
reconfigurable interconnects and memory. Versa leverages reconfigurable functional units …

Cited by: 12 | Related articles | All 7 versions

[PDF] arxiv.org

An adjustable farthest point sampling method for approximately-sorted point cloud data

J Li, J Zhou, Y Xiong, X Chen… - 2022 IEEE Workshop …, 2022 - ieeexplore.ieee.org

Sampling is an essential part of raw point cloud data processing, such as in the popular
PointNet++ scheme. Farthest Point Sampling (FPS), which iteratively samples the farthest …

Cited by: 10 | Related articles | All 4 versions

[PDF] ieee.org

Blocks: Challenging SIMDs and VLIWs with a reconfigurable architecture

M Wijtvliet, A Kumar, H Corporaal - IEEE Transactions on …, 2021 - ieeexplore.ieee.org

Demand for coarse grain reconfigurable architectures (CGRAs) has significantly increased
in recent years as architectures need to be both energy efficient and flexible. However, most …

Cited by: 10 | Related articles | All 3 versions

[PDF] archive.org

OnSRAM: Efficient inter-node on-chip scratchpad management in deep learning accelerators

S Pal, S Venkataramani, V Srinivasan… - ACM Transactions on …, 2022 - dl.acm.org

Hardware acceleration of Artificial Intelligence (AI) workloads has gained widespread
popularity with its potential to deliver unprecedented performance and efficiency. An …

Cited by: 5 | Related articles | All 3 versions

[PDF] arxiv.org

Rewriting History: Repurposing Domain-Specific CGRAs

J Woodruff, T Koehler, A Brauckmann… - arXiv preprint arXiv …, 2023 - arxiv.org

Coarse-grained reconfigurable arrays (CGRAs) are domain-specific devices promising both
the flexibility of FPGAs and the performance of ASICs. However, with restricted domains …

Cited by: 1 | Related articles | All 3 versions
