Novel accelerated methods for convolution neural network with matrix core

Y Guo, L Lu, S Zhu - The Journal of Supercomputing, 2023 - Springer
The powerful parallel computing capability of GPU and the development of matrix
processing unit in recent years provide more possibilities to improve the performance of …

WASP: Exploiting GPU Pipeline Parallelism with Hardware-Accelerated Automatic Warp Specialization

NC Crago, S Damani, K Sankaralingam… - … Symposium on High …, 2024 - ieeexplore.ieee.org
Graphics processing units (GPUs) are an important class of parallel processors that offer
high compute throughput and memory bandwidth. GPUs are used in a variety of important …

A Decomposable Winograd Method for N–D Convolution Acceleration in Video Analysis

D Huang, R Zhang, X Zhang, F Wu, X Wang… - International Journal of …, 2021 - Springer
Winograd's minimal filtering algorithm has been widely used in 2-D Convolutional Neural
Networks (CNNs) to reduce the number of multiplications for faster processing. However, it is …