FACT: FFN-Attention Co-optimized Transformer Architecture with Eager Correlation Prediction

Y Qin, Y Wang, D Deng, Z Zhao, X Yang, L Liu… - Proceedings of the 50th …, 2023 - dl.acm.org
The Transformer model is becoming prevalent in various AI applications thanks to its outstanding
performance. However, its high computation cost and memory footprint make its …

SOFA: A Compute-Memory Optimized Sparsity Accelerator via Cross-Stage Coordinated Tiling

H Wang, J Fang, X Tang, Z Yue, J Li… - 2024 57th IEEE/ACM …, 2024 - ieeexplore.ieee.org
Benefiting from the self-attention mechanism, Transformer models have attained impressive
contextual comprehension capabilities for lengthy texts. The requirements of high …

SG-Float: Achieving Memory Access and Computing Power Reduction Using Self-Gating Float in CNNs

JS Wu, TW Hsu, RS Liu - ACM Transactions on Embedded Computing …, 2023 - dl.acm.org
Convolutional neural networks (CNNs) are essential for advancing the field of artificial
intelligence. However, since these networks are highly demanding in terms of memory and …

SySMOL: A Hardware-software Co-design Framework for Ultra-Low and Fine-Grained Mixed-Precision Neural Networks

C Zhou, V Richard, P Savarese, Z Hassman… - arXiv preprint arXiv …, 2023 - arxiv.org
Recent advancements in quantization and mixed-precision techniques offer significant
promise for improving the run-time and energy efficiency of neural networks. In this work, we …

BitWave: Exploiting Column-Based Bit-Level Sparsity for Deep Learning Acceleration

M Shi, V Jain, A Joseph, M Meijer… - 2024 IEEE International …, 2024 - ieeexplore.ieee.org
Bit-serial computation facilitates bit-wise sequential data processing, offering numerous
benefits such as a reduced area footprint and dynamically adaptive computational …

Pianissimo: A Sub-mW Class DNN Accelerator With Progressively Adjustable Bit-Precision

J Suzuki, J Yu, M Yasunaga, ÁL García-Arias… - IEEE …, 2023 - ieeexplore.ieee.org
With the widespread adoption of edge AI, the diversity of application requirements and
fluctuating computational demands present significant challenges. Conventional …

Progressive Variable Precision DNN With Bitwise Ternary Accumulation

J Suzuki, M Yasunaga, K Kawamura… - 2024 IEEE 6th …, 2024 - ieeexplore.ieee.org
Progressive variable precision networks are capable of adapting to changing computational
needs over time using a single weight set. However, previous works have two problems: 1) …