Full stack optimization of transformer inference: a survey

S Kim, C Hooper, T Wattanawong, M Kang… - arXiv preprint arXiv …, 2023 - arxiv.org
Recent advances in state-of-the-art DNN architecture design have been moving toward
Transformer models. These models achieve superior accuracy across a wide range of …

DIANA: An end-to-end hybrid digital and analog neural network SoC for the edge

P Houshmand, GM Sarda, V Jain… - IEEE Journal of Solid …, 2022 - ieeexplore.ieee.org
DIgital-ANAlog (DIANA), a heterogeneous multi-core accelerator, combines a reduced
instruction set computer-five (RISC-V) host processor with an analog in-memory computing …

A full-stack search technique for domain optimized deep learning accelerators

D Zhang, S Huda, E Songhori, K Prabhu, Q Le… - Proceedings of the 27th …, 2022 - dl.acm.org
The rapidly changing deep learning landscape presents a unique opportunity for building
inference accelerators optimized for specific datacenter-scale workloads. We propose Full …

DOSA: Differentiable model-based one-loop search for DNN accelerators

C Hong, Q Huang, G Dinh, M Subedar… - Proceedings of the 56th …, 2023 - dl.acm.org
In the hardware design space exploration process, it is critical to optimize both hardware
parameters and algorithm-to-hardware mappings. Previous work has largely approached …

DeFiNES: Enabling fast exploration of the depth-first scheduling space for DNN accelerators through analytical modeling

L Mei, K Goetschalckx, A Symons… - 2023 IEEE International …, 2023 - ieeexplore.ieee.org
DNN workloads can be scheduled onto DNN accelerators in many different ways: from layer-
by-layer scheduling to cross-layer depth-first scheduling (aka layer fusion, or cascaded …

TinyVers: A tiny versatile system-on-chip with state-retentive eMRAM for ML inference at the extreme edge

V Jain, S Giraldo, J De Roose, L Mei… - IEEE Journal of Solid …, 2023 - ieeexplore.ieee.org
Extreme edge devices or Internet-of-Things (IoT) nodes require both ultra-low power (ULP)
always-on (AON) processing and the ability to do on-demand sampling and …

TelaMalloc: Efficient on-chip memory allocation for production machine learning accelerators

M Maas, U Beaugnon, A Chauhan, B Ilbeyi - Proceedings of the 28th …, 2022 - dl.acm.org
Memory buffer allocation for on-chip memories is a major challenge in modern machine
learning systems that target ML accelerators. In interactive systems such as mobile phones …

Leveraging domain information for the efficient automated design of deep learning accelerators

C Sakhuja, Z Shi, C Lin - 2023 IEEE International Symposium …, 2023 - ieeexplore.ieee.org
Deep learning accelerators are important tools for feeding the growing demand for deep
learning applications. The automated design of such accelerators—which is important for …

Benchmarking and modeling of analog and digital SRAM in-memory computing architectures

P Houshmand, J Sun, M Verhelst - arXiv preprint arXiv:2305.18335, 2023 - arxiv.org
In-memory computing is emerging as an efficient hardware paradigm for deep neural
network accelerators at the edge, enabling them to break the memory wall and exploit massive …

Demystifying map space exploration for NPUs

SC Kao, A Parashar, PA Tsai… - 2022 IEEE International …, 2022 - ieeexplore.ieee.org
Map Space Exploration is the problem of finding optimized mappings of a Deep Neural
Network (DNN) model on an accelerator. It is known to be extremely computationally …