Chainsaw: Von-neumann accelerators to leverage fused instruction chains

Z Wang, T Nowatzki - Proceedings of the 46th International Symposium …, 2019 - dl.acm.org

Because of severe limitations in technology scaling, architects have innovated in
specializing general purpose processors for computation primitives (e.g. vector instructions …

Cited by 55 Related articles All 4 versions

[PDF] inesc-id.pt

Unlimited vector extension with data streaming support

JM Domingos, N Neves, N Roma… - 2021 ACM/IEEE 48th …, 2021 - ieeexplore.ieee.org

Unlimited vector extension (UVE) is a novel instruction set architecture extension that takes
streaming and SIMD processing together into the modern computing scenario. It aims to …

Cited by 34 Related articles All 7 versions

[PDF] arxiv.org

DPU-v2: Energy-efficient execution of irregular directed acyclic graphs

N Shah, W Meert, M Verhelst - 2022 55th IEEE/ACM …, 2022 - ieeexplore.ieee.org

A growing number of applications like probabilistic machine learning, sparse linear algebra,
robotic navigation, etc., exhibit irregular data flow computation that can be modeled with …

Cited by 8 Related articles All 6 versions

[PDF] acm.org

Novia: A framework for discovering non-conventional inline accelerators

D Trilla, JD Wellman, A Buyuktosunoglu… - MICRO-54: 54th Annual …, 2021 - dl.acm.org

Accelerators provide an increasingly valuable source of performance in modern computing
systems. In most cases, accelerators are implemented as stand-alone, offload engines to …

Cited by 10 Related articles All 3 versions

[PDF] sfu.ca

Deepframe: A profile-driven compiler for spatial hardware accelerators

A Guha, N Vedula, A Shriraman - 2019 28th International …, 2019 - ieeexplore.ieee.org

Tracing code paths to form extended basic blocks is useful in many areas, compiler
optimizations [1], improving instruction cache behavior [2] and custom-hardware offloading …

Cited by 16 Related articles All 4 versions

[PDF] psu.edu

Characterizing diverse handheld apps for customized hardware acceleration

PV Rengasamy, H Zhang… - 2017 IEEE …, 2017 - ieeexplore.ieee.org

Current handhelds incorporate a variety of accelerators/IPs for improving their performance
and energy efficiency. While these IPs are extremely useful for accelerating parts of a …

Cited by 20 Related articles All 5 versions

[PDF] acm.org

Mirage cores: The illusion of many out-of-order cores using in-order hardware

S Padmanabha, A Lukefahr, R Das… - Proceedings of the 50th …, 2017 - dl.acm.org

Heterogeneous chip multiprocessors (Het-CMPs) offer a combination of large Out-of-Order
(OoO) cores optimized for high single-threaded performance and small In-Order (InO) cores …

Cited by 13 Related articles All 6 versions

[BOOK][B] Efficient Execution of Irregular Dataflow Graphs: Hardware/Software Co-optimization for Probabilistic AI and Sparse Linear Algebra

N Shah, W Meert, M Verhelst - 2023 - books.google.com

This book focuses on the acceleration of emerging irregular sparse workloads, posed by
novel artificial intelligence (AI) models and sparse linear algebra. Specifically, the book …

Cited by 4 Related articles All 2 versions

[PDF] nsf.gov

Decentralized offload-based execution on memory-centric compute cores

S Baskaran, J Sampson - … of the International Symposium on Memory …, 2020 - dl.acm.org

With the end of Dennard scaling, power constraints have led to increasing compute
specialization in the form of differently specialized accelerators integrated at various levels …

Cited by 8 Related articles All 4 versions

Nachos: Software-driven hardware-assisted memory disambiguation for accelerators

N Vedula, A Shriraman, S Kumar… - … Symposium on High …, 2018 - ieeexplore.ieee.org

Hardware accelerators have relied on the compiler to extract instruction parallelism but may
waste significant energy in enforcing memory ordering and discovering memory parallelism …

Cited by 10 Related articles All 2 versions
