Learning low-precision structured subnetworks using joint layerwise channel pruning and uniform quantization

X Zhang, I Colbert, S Das - Applied Sciences, 2022 - mdpi.com
Pruning and quantization are core techniques used to reduce the inference costs of deep
neural networks. Among the state-of-the-art pruning techniques, magnitude-based pruning …
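As a rough illustration of the two techniques named in this snippet, the sketch below applies magnitude-based channel pruning followed by symmetric uniform quantization to a convolution weight in PyTorch. It is a generic toy example under assumed settings (L1 channel scoring, per-tensor scaling, 8-bit fake quantization), not the joint layerwise method proposed in the paper.

import torch

def prune_channels_by_magnitude(weight: torch.Tensor, keep_ratio: float) -> torch.Tensor:
    # Zero out the output channels of a conv weight (O, I, kH, kW) with the
    # smallest L1 norms, keeping a keep_ratio fraction of channels.
    l1 = weight.abs().sum(dim=(1, 2, 3))            # per-output-channel magnitude
    n_keep = max(1, int(keep_ratio * weight.shape[0]))
    keep_idx = torch.topk(l1, n_keep).indices
    mask = torch.zeros(weight.shape[0], dtype=weight.dtype)
    mask[keep_idx] = 1.0
    return weight * mask.view(-1, 1, 1, 1)

def uniform_quantize(weight: torch.Tensor, num_bits: int = 8) -> torch.Tensor:
    # Symmetric uniform (fake) quantization: round values onto a regular grid
    # of 2**num_bits - 1 levels, then map them back to floating point.
    qmax = 2 ** (num_bits - 1) - 1
    scale = weight.abs().max() / qmax
    return torch.clamp(torch.round(weight / scale), -qmax, qmax) * scale

w = torch.randn(64, 32, 3, 3)                       # example conv weight
w_compressed = uniform_quantize(prune_channels_by_magnitude(w, keep_ratio=0.5))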

[HTML] MOHAQ: Multi-Objective Hardware-Aware Quantization of recurrent neural networks

NM Rezk, T Nordström, D Stathis, Z Ul-Abdin… - Journal of systems …, 2022 - Elsevier
The compression of deep learning models is of fundamental importance in deploying such
models to edge devices. The selection of compression parameters can be automated to …
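The snippet frames compression-parameter selection as something to automate. Below is a toy sketch of that idea: enumerating per-layer bit-width assignments and keeping the Pareto front under two objectives, model size and an error proxy. The layer sizes, candidate bit widths, and error proxy are assumptions for illustration only, not the multi-objective procedure from the paper.

import itertools

LAYER_SIZES = [4096, 16384, 4096]      # hypothetical parameter counts per layer
CANDIDATE_BITS = [2, 4, 8]

def model_size_bits(bit_widths):
    return sum(b * n for b, n in zip(bit_widths, LAYER_SIZES))

def error_proxy(bit_widths):
    # Stand-in for a real accuracy evaluation: fewer bits -> higher assumed error.
    return sum(1.0 / b for b in bit_widths)

def dominates(a, b):
    # a dominates b if it is no worse on both objectives and better on at least one.
    return (model_size_bits(a) <= model_size_bits(b)
            and error_proxy(a) <= error_proxy(b)
            and (model_size_bits(a) < model_size_bits(b) or error_proxy(a) < error_proxy(b)))

def pareto_front(configs):
    return [c for c in configs if not any(dominates(o, c) for o in configs)]

configs = list(itertools.product(CANDIDATE_BITS, repeat=len(LAYER_SIZES)))
for cfg in pareto_front(configs):
    print(cfg, model_size_bits(cfg), round(error_proxy(cfg), 3))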

[BOOK] End-to-End Inference Optimization for Deep Learning-based Image Upsampling Networks

I Colbert - 2023 - search.proquest.com
Many computer vision problems require image upsampling, where the number of pixels per
unit area is increased by inferring values in high-dimensional image space from …
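As a small aside to the definition given in the snippet, the lines below show the simplest form of image upsampling in PyTorch: increasing the pixel count of a tensor by interpolation. The scale factor and interpolation mode are arbitrary choices for the example, not specifics from the dissertation, which concerns learned upsampling networks.

import torch
import torch.nn.functional as F

low_res = torch.rand(1, 3, 32, 32)     # (batch, channels, height, width)
high_res = F.interpolate(low_res, scale_factor=2, mode="bilinear", align_corners=False)
print(low_res.shape, "->", high_res.shape)   # 32x32 becomes 64x64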