Quantization is a technique to reduce the computation and memory cost of DNN models, which are getting increasingly large. Existing quantization solutions use fixed-point integer …
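To ground the snippet, here is a minimal numpy sketch of symmetric fixed-point (int8) quantization; the scale choice and function names are illustrative assumptions, not the cited paper's scheme.

```python
import numpy as np

def quantize_int8(x: np.ndarray):
    """Uniform symmetric quantization of a float tensor to int8.

    A generic sketch of fixed-point integer quantization; not the
    specific scheme proposed in the cited paper.
    """
    scale = max(np.abs(x).max() / 127.0, 1e-8)  # one scale per tensor (assumption)
    q = np.clip(np.round(x / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q: np.ndarray, scale: float) -> np.ndarray:
    # Recover an approximation of the original float values.
    return q.astype(np.float32) * scale

w = np.random.randn(4, 4).astype(np.float32)
q, s = quantize_int8(w)
print(np.abs(w - dequantize(q, s)).max())  # error is bounded by ~scale/2
```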
Z Liu, J Leng, Z Zhang, Q Chen, C Li… - Proceedings of the 27th …, 2022 - dl.acm.org
Deep learning (DL) models have achieved great success in many application domains. As such, many industrial companies such as Google and Facebook have acknowledged the …
Y Feng, G Hammonds, Y Gan, Y Zhu - Proceedings of the 49th Annual …, 2022 - dl.acm.org
3D perception in point clouds is transforming the perception ability of future intelligent machines. Point cloud algorithms, however, are plagued by irregular memory accesses …
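As a concrete illustration of the irregular accesses the snippet mentions, here is a toy k-nearest-neighbor gather in numpy; the brute-force distance computation and the value of k are assumptions for illustration only, not the cited technique.

```python
import numpy as np

def knn_gather(points: np.ndarray, k: int) -> np.ndarray:
    """For each point, gather the coordinates of its k nearest neighbors.

    Illustrates the indexed (data-dependent, hence irregular) memory
    access pattern typical of point cloud operators.
    """
    # Pairwise squared distances between all points (N x N).
    d2 = ((points[:, None, :] - points[None, :, :]) ** 2).sum(-1)
    idx = np.argsort(d2, axis=1)[:, 1:k + 1]  # k nearest, excluding self
    return points[idx]                        # gather at data-dependent addresses

pts = np.random.rand(128, 3).astype(np.float32)
neighbors = knn_gather(pts, k=8)              # shape (128, 8, 3)
print(neighbors.shape)
```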
S Barrachina, MF Dolz, P San Juan… - Journal of Parallel and …, 2022 - Elsevier
Convolutional Neural Networks (CNNs) play a crucial role in many image recognition and classification tasks, recommender systems, brain-computer interfaces, etc …
This article proposes a novel hardware accelerator for inference with sparse convolutional neural networks (CNNs), building a hardware unit to perform Image to …
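The truncated phrase presumably refers to an image-to-column (im2col) transform. Below is a textbook software sketch of it (stride 1, no padding assumed), showing how it turns convolution into a single matrix multiply, the operation such an accelerator would perform in hardware.

```python
import numpy as np

def im2col(x: np.ndarray, kh: int, kw: int) -> np.ndarray:
    """Unfold a (C, H, W) image into a matrix whose columns are receptive
    fields, so convolution becomes one GEMM. Stride 1, no padding (assumed).
    """
    c, h, w = x.shape
    oh, ow = h - kh + 1, w - kw + 1
    cols = np.empty((c * kh * kw, oh * ow), dtype=x.dtype)
    for i in range(oh):
        for j in range(ow):
            cols[:, i * ow + j] = x[:, i:i + kh, j:j + kw].ravel()
    return cols

x = np.random.rand(3, 8, 8).astype(np.float32)
filters = np.random.rand(16, 3 * 3 * 3).astype(np.float32)  # 16 3x3 filters
out = filters @ im2col(x, 3, 3)                             # conv as GEMM: (16, 36)
print(out.shape)
```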
Deep Neural Network (DNN) inference based on quantized narrow-precision integer data represents a promising research direction toward efficient deep learning computations on …
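A minimal sketch of what narrow-precision integer inference looks like at the kernel level, assuming a standard int8-operand, int32-accumulator design; the scales and shapes here are invented for illustration.

```python
import numpy as np

def int8_matmul(a_q, b_q, a_scale, b_scale):
    """Narrow-precision integer matmul: int8 operands, int32 accumulation,
    rescaled back to float. A generic sketch, not any specific datapath.
    """
    acc = a_q.astype(np.int32) @ b_q.astype(np.int32)  # wide accumulator avoids overflow
    return acc.astype(np.float32) * (a_scale * b_scale)

a = (np.random.randn(8, 16) * 20).clip(-127, 127).astype(np.int8)
b = (np.random.randn(16, 4) * 20).clip(-127, 127).astype(np.int8)
print(int8_matmul(a, b, a_scale=0.05, b_scale=0.02).shape)  # (8, 4)
```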
Y Tan, K Han, K Zhao, X Yu, Z Du… - Advances in …, 2022 - proceedings.neurips.cc
Weight sparsity is a promising approach to reducing the model size and computation cost of convolutional neural networks (CNNs). Nevertheless, non-zero weights often distribute …
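For context, a baseline magnitude-pruning sketch that produces this kind of unstructured sparsity; the threshold rule is a common heuristic rather than the cited paper's method, and it shows why non-zeros land at irregular positions.

```python
import numpy as np

def magnitude_prune(w: np.ndarray, sparsity: float) -> np.ndarray:
    """Zero out the smallest-magnitude fraction of weights (common baseline)."""
    k = int(w.size * sparsity)
    thresh = np.sort(np.abs(w).ravel())[k]
    return np.where(np.abs(w) < thresh, 0.0, w)

w = np.random.randn(8, 8).astype(np.float32)
pruned = magnitude_prune(w, sparsity=0.75)
print((pruned != 0).astype(int))  # surviving non-zeros scatter irregularly
```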
S Hwang, S Lee, J Kim, H Kim… - 2023 IEEE International …, 2023 - ieeexplore.ieee.org
Multi-core neural processing units (NPUs) have emerged to scale NPU computation capability and efficiently support diverse machine learning tasks. In such multi-core NPUs …
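One simple way multi-core NPUs scale throughput is data parallelism across cores; the batch split below is a toy illustration under that assumption, not the scheduler studied in the cited work.

```python
import numpy as np

def run_on_cores(batch: np.ndarray, n_cores: int, layer):
    """Toy data-parallel schedule: shard a batch across cores and run the
    same layer on each shard. Purely illustrative of multi-core scaling.
    """
    shards = np.array_split(batch, n_cores)  # one shard per core
    outputs = [layer(s) for s in shards]     # each core runs independently
    return np.concatenate(outputs)

relu = lambda x: np.maximum(x, 0.0)
batch = np.random.randn(32, 64).astype(np.float32)
print(run_on_cores(batch, n_cores=4, layer=relu).shape)  # (32, 64)
```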