Learning low-precision structured subnetworks using joint layerwise channel pruning and uniform quantization
Pruning and quantization are core techniques used to reduce the inference costs of deep
neural networks. Among the state-of-the-art pruning techniques, magnitude-based pruning …
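
The title pairs two standard ingredients: magnitude-based channel pruning and uniform quantization. The sketch below illustrates those two generic operations on a single PyTorch layer; the keep ratio, bit width, and layer shape are illustrative assumptions, not the configuration used in the paper.

# Generic sketch of layerwise magnitude-based channel pruning followed by
# uniform quantization (illustrative values, not the paper's method).
import torch
import torch.nn as nn

def prune_channels_by_magnitude(conv: nn.Conv2d, keep_ratio: float) -> torch.Tensor:
    """Return a 0/1 mask over output channels, keeping the largest-norm filters."""
    norms = conv.weight.detach().abs().sum(dim=(1, 2, 3))  # L1 norm per output channel
    n_keep = max(1, int(keep_ratio * norms.numel()))
    keep_idx = torch.topk(norms, n_keep).indices
    mask = torch.zeros_like(norms)
    mask[keep_idx] = 1.0
    return mask

def uniform_quantize(w: torch.Tensor, num_bits: int) -> torch.Tensor:
    """Symmetric uniform quantization of a weight tensor to num_bits."""
    qmax = 2 ** (num_bits - 1) - 1
    scale = w.abs().max() / qmax
    return torch.clamp(torch.round(w / scale), -qmax - 1, qmax) * scale

conv = nn.Conv2d(16, 32, kernel_size=3)                 # assumed toy layer
mask = prune_channels_by_magnitude(conv, keep_ratio=0.5)
with torch.no_grad():
    conv.weight.mul_(mask.view(-1, 1, 1, 1))            # zero out pruned channels
    conv.weight.copy_(uniform_quantize(conv.weight, num_bits=4))
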
MOHAQ: Multi-Objective Hardware-Aware Quantization of recurrent neural networks
The compression of deep learning models is of fundamental importance in deploying such
models to edge devices. The selection of compression parameters can be automated to …
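
The "multi-objective" angle in this title can be illustrated with a generic Pareto filter over candidate quantization configurations. The configurations and their (error, cost) scores below are made up for illustration; this is not MOHAQ's actual search procedure.

# Generic Pareto-front filter over compression configurations, scored on a
# task-error proxy and a hardware-cost proxy (lower is better for both).
from typing import List, Tuple

def pareto_front(candidates: List[Tuple[str, float, float]]) -> List[Tuple[str, float, float]]:
    """Keep configurations not dominated in both objectives."""
    front = []
    for name, err, cost in candidates:
        dominated = any(e <= err and c <= cost and (e < err or c < cost)
                        for _, e, c in candidates)
        if not dominated:
            front.append((name, err, cost))
    return front

# Hypothetical per-layer bit-width configurations with (error, cost) estimates.
configs = [("8/8 bit", 0.021, 1.00), ("4/8 bit", 0.024, 0.61),
           ("8/4 bit", 0.036, 0.70), ("4/4 bit", 0.035, 0.38)]
print(pareto_front(configs))   # "8/4 bit" is dominated by "4/8 bit" and dropped
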
End-to-End Inference Optimization for Deep Learning-based Image Upsampling Networks
I Colbert - 2023 - search.proquest.com
Many computer vision problems require image upsampling, where the number of pixels per
unit area is increased by inferring values in high-dimensional image space from …
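
As a rough illustration of what image upsampling means here, the sketch below contrasts fixed bicubic interpolation with a toy learned upsampler built from a convolution and PixelShuffle; it is not the architecture studied in the thesis.

# Upsampling doubles the pixels per axis, either with a fixed interpolation
# kernel or with a small learned sub-pixel-convolution network (toy example).
import torch
import torch.nn as nn
import torch.nn.functional as F

x = torch.rand(1, 3, 32, 32)                      # low-resolution input image

# Fixed interpolation: no learned parameters.
bicubic = F.interpolate(x, scale_factor=2, mode="bicubic", align_corners=False)

# Learned upsampling: convolution producing r^2 * C channels, rearranged by
# PixelShuffle into a 2x larger spatial grid.
upsampler = nn.Sequential(
    nn.Conv2d(3, 3 * 2 ** 2, kernel_size=3, padding=1),
    nn.PixelShuffle(2),
)
learned = upsampler(x)
print(bicubic.shape, learned.shape)               # both: (1, 3, 64, 64)
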