Differentiable joint pruning and quantization for hardware efficiency

Y Wang, Y Lu, T Blankevoort - European Conference on Computer Vision, 2020 - Springer
We present a differentiable joint pruning and quantization (DJPQ) scheme. We frame neural
network compression as a joint gradient-based optimization problem, trading off between
model pruning and quantization automatically for hardware efficiency. DJPQ incorporates
variational information bottleneck based structured pruning and mixed-bit precision
quantization into a single differentiable loss function. In contrast to previous works which
consider pruning and quantization separately, our method enables users to find the optimal …
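A minimal sketch of what "a single differentiable loss function" could look like for joint pruning and quantization. It is not the paper's formulation: the variational information bottleneck gating is replaced here by plain sigmoid gates, and the cost proxy (expected kept channels times bit-width), the function names, and the `trade_off` parameter are all assumptions made for illustration.

```python
import math


def sigmoid(x):
    """Logistic function, used here as a soft keep-probability for a channel."""
    return 1.0 / (1.0 + math.exp(-x))


def hardware_cost(gate_logits, bit_widths):
    """Toy hardware-cost proxy (an assumption, not the paper's metric).

    gate_logits: one list of per-channel gate logits per layer.
    bit_widths:  one continuous bit-width per layer.
    Returns the sum over layers of (expected kept channels * bit-width).
    """
    cost = 0.0
    for logits, bits in zip(gate_logits, bit_widths):
        expected_kept = sum(sigmoid(g) for g in logits)
        cost += expected_kept * bits
    return cost


def joint_loss(task_loss, gate_logits, bit_widths, trade_off=0.01):
    """Task loss plus a weighted cost term.

    Both the gates and the bit-widths enter one smooth expression, so a
    gradient-based optimizer could trade pruning against precision.
    """
    return task_loss + trade_off * hardware_cost(gate_logits, bit_widths)


# Two layers: the first keeps most channels at 8 bits, the second is
# heavily gated off and uses 4 bits.
example = joint_loss(
    task_loss=0.7,
    gate_logits=[[2.0, 2.0, 2.0], [-3.0, -3.0]],
    bit_widths=[8.0, 4.0],
)
```

Lowering a gate logit or a bit-width reduces the cost term, so in this toy setup the optimizer can reach the same budget either by pruning channels or by lowering precision. A real implementation would also need a differentiable treatment of rounding in the quantizer, which this sketch leaves out.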