SOLE: Hardware-Software Co-design of Softmax and LayerNorm for Efficient Transformer Inference

W Wang, S Zhou, W Sun, P Sun… - 2023 IEEE/ACM …, 2023 - ieeexplore.ieee.org
Transformers have shown remarkable performance in both natural language processing
(NLP) and computer vision (CV) tasks. However, their real-time inference speed and …

MASL-AFU: A High Memory Access Efficiency 2-D Scalable LUT-Based Activation Function Unit for On-Device DNN Training

Z Meng, L Shu, J Zeng, Z Li, K Lv… - IEEE Transactions on …, 2024 - ieeexplore.ieee.org
On-device deep neural network (DNN) training faces constraints in storage capacity and
energy supply. Existing works primarily focus on optimizing the training of convolutional and …

Special session: When dataflows converge: Reconfigurable and approximate computing for emerging neural networks

D Wu, J San Miguel - 2021 IEEE 39th International Conference …, 2021 - ieeexplore.ieee.org
Deep Neural Networks (DNNs) have gained significant attention in both academia and
industry due to the superior application-level accuracy. As DNNs rely on compute-or …

[BOOK][B] Power-Efficient Computer Architecture via Unary and Approximate Computing

D Wu - 2023 - search.proquest.com
In the last decade, deep learning has gained soaring popularity in both academia and
industry, playing an indispensable role in many aspects of human lives. The resource …

[BOOK][B] Empowering Security and Privacy-Preserving Interactions for Smart Device Users

J Li - 2023 - search.proquest.com
Emerging smart devices, such as smart home and augmented/virtual reality systems, are
reforming our living experience by automating our daily routines and interacting with us …