Wireless network intelligence at the edge

J Park, S Samarakoon, M Bennis… - Proceedings of the …, 2019 - ieeexplore.ieee.org
Fueled by the availability of more data and computing power, recent breakthroughs in cloud-
based machine learning (ML) have transformed every aspect of our lives from face …

Efficient neural audio synthesis

N Kalchbrenner, E Elsen, K Simonyan… - International …, 2018 - proceedings.mlr.press
Sequential models achieve state-of-the-art results in audio, visual and textual domains with
respect to both estimating the data distribution and generating desired samples. Efficient …

Structured compression by weight encryption for unstructured pruning and quantization

SJ Kwon, D Lee, B Kim, P Kapoor… - Proceedings of the …, 2020 - openaccess.thecvf.com
Model compression techniques, such as pruning and quantization, are becoming
increasingly important to reduce memory footprints and the amount of computation …
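
The snippet only names pruning and quantization; for context, here is a minimal sketch of the generic forms of both, magnitude-based unstructured pruning followed by symmetric uniform quantization of the surviving weights. It is not the weight-encryption scheme the paper proposes, and all function names are illustrative.

```python
# Illustrative only: generic magnitude pruning + uniform quantization,
# not the structured weight-encryption method of the cited paper.
import numpy as np

def magnitude_prune(w: np.ndarray, sparsity: float) -> np.ndarray:
    """Zero out the smallest-magnitude fraction `sparsity` of the weights."""
    k = int(np.floor(sparsity * w.size))
    if k == 0:
        return w.copy()
    threshold = np.partition(np.abs(w).ravel(), k - 1)[k - 1]
    return np.where(np.abs(w) <= threshold, 0.0, w)

def uniform_quantize(w: np.ndarray, n_bits: int = 4):
    """Symmetric uniform quantization; dequantize as codes * scale."""
    scale = np.max(np.abs(w)) / (2 ** (n_bits - 1) - 1)
    codes = np.round(w / scale).astype(np.int8)
    return codes, scale

w = np.random.randn(256, 256).astype(np.float32)
w_pruned = magnitude_prune(w, sparsity=0.9)
codes, scale = uniform_quantize(w_pruned, n_bits=4)
print("nonzero weights:", np.count_nonzero(w_pruned), "of", w.size)
```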

CSCNN: Algorithm-hardware co-design for CNN accelerators using centrosymmetric filters

J Li, A Louri, A Karanth… - 2021 IEEE International …, 2021 - ieeexplore.ieee.org
Convolutional neural networks (CNNs) are at the core of many state-of-the-art deep learning
models in computer vision, speech, and text processing. Training and deploying such CNN …

DeepTwist: Learning model compression via occasional weight distortion

D Lee, P Kapoor, B Kim - arXiv preprint arXiv:1810.12823, 2018 - arxiv.org
Model compression has been introduced to reduce the required hardware resources while
maintaining model accuracy. Many techniques for model compression, such as …

Learning low-rank approximation for CNNs

D Lee, SJ Kwon, B Kim, GY Wei - arXiv preprint arXiv:1905.10145, 2019 - arxiv.org
Low-rank approximation is an effective model compression technique that reduces not only
parameter storage requirements but also computation. For convolutional neural …
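
For reference, the sketch below shows the textbook truncated-SVD form of low-rank approximation applied to a dense weight matrix; how the cited paper selects or learns the factors is not reflected here, and the rank of 64 is an arbitrary illustrative choice.

```python
# Illustrative only: truncated-SVD low-rank factorization of a weight matrix,
# the textbook form of the technique named in the snippet above.
import numpy as np

def low_rank_factorize(w: np.ndarray, rank: int):
    """Approximate w (m x n) as a @ b with a: m x rank and b: rank x n."""
    u, s, vt = np.linalg.svd(w, full_matrices=False)
    a = u[:, :rank] * s[:rank]      # absorb singular values into the left factor
    b = vt[:rank, :]
    return a, b

w = np.random.randn(512, 512).astype(np.float32)
a, b = low_rank_factorize(w, rank=64)
# Parameters drop from m*n = 262144 to rank*(m + n) = 65536.
err = np.linalg.norm(w - a @ b) / np.linalg.norm(w)
print(f"relative reconstruction error: {err:.3f}")
```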

FleXOR: Trainable fractional quantization

D Lee, SJ Kwon, B Kim, Y Jeon… - Advances in neural …, 2020 - proceedings.neurips.cc
Quantization based on binary codes is gaining attention because each quantized bit can
be directly utilized for computations without dequantization using look-up tables. Previous …
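
The first sentence refers to multi-bit binary-code quantization, where a weight tensor is approximated as a sum of scaled {-1, +1} bit-planes so that inner products can operate on the bits directly. The greedy residual fit below is a common baseline form of that idea, not FleXOR's fractional-bit scheme; all names are illustrative.

```python
# Illustrative only: greedy multi-bit binary-code quantization
# (w ~= sum_i alpha_i * b_i with b_i in {-1, +1}); not FleXOR itself.
import numpy as np

def binary_code_quantize(w: np.ndarray, n_bits: int):
    """Fit each bit-plane to the current residual: b = sign(r), alpha = mean(|r|)."""
    residual = w.astype(np.float32).copy()
    bit_planes, scales = [], []
    for _ in range(n_bits):
        b = np.where(residual >= 0, 1.0, -1.0)
        alpha = float(np.mean(np.abs(residual)))
        bit_planes.append(b)
        scales.append(alpha)
        residual -= alpha * b
    return np.stack(bit_planes), np.array(scales)

w = np.random.randn(1024).astype(np.float32)
planes, alphas = binary_code_quantize(w, n_bits=3)
w_hat = (alphas[:, None] * planes).sum(axis=0)
print("mse:", float(np.mean((w - w_hat) ** 2)))
```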

GST: Group-sparse training for accelerating deep reinforcement learning

J Lee, S Kim, S Kim, W Jo, HJ Yoo - arXiv preprint arXiv:2101.09650, 2021 - arxiv.org
Deep reinforcement learning (DRL) has shown remarkable success in sequential decision-
making problems but suffers from long training times to reach such performance …
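
The snippet is cut off before the method itself, so only the generic ingredient suggested by the title is sketched here: a group-lasso penalty over output channels, a standard way to induce group-sparse weights during training. This is an assumption for illustration, not GST's actual scheme.

```python
# Illustrative only: a generic group-lasso penalty with one group per output
# channel of a conv layer; adding it to the training loss drives whole groups
# toward zero. This is not the specific mechanism of the cited paper.
import numpy as np

def group_lasso_penalty(w: np.ndarray, lam: float = 1e-4) -> float:
    """w: conv weight of shape (out_ch, in_ch, kh, kw); one group per output channel."""
    group_norms = np.sqrt((w.reshape(w.shape[0], -1) ** 2).sum(axis=1))
    return lam * float(group_norms.sum())

w = np.random.randn(64, 32, 3, 3).astype(np.float32)
print("group-lasso penalty:", group_lasso_penalty(w))
```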

Double Viterbi: Weight encoding for high compression ratio and fast on-chip reconstruction for deep neural network

D Ahn, D Lee, T Kim, JJ Kim - International Conference on Learning …, 2018 - openreview.net
Weight pruning has been introduced as an efficient model compression technique. Even
though pruning removes a significant amount of weights in a network, memory requirement …
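
The truncated second sentence points at the usual caveat that, after unstructured pruning, the positions of the surviving weights still have to be stored. The CSR sketch below illustrates that index overhead for a generic pruned matrix; it says nothing about the Viterbi-based encoding the paper proposes.

```python
# Illustrative only: CSR storage of a magnitude-pruned matrix, showing that
# index metadata keeps the footprint well above "values only". This is generic
# sparse storage, not the Viterbi-based weight encoding of the cited paper.
import numpy as np
from scipy.sparse import csr_matrix

w = np.random.randn(1024, 1024).astype(np.float32)
threshold = np.quantile(np.abs(w), 0.95)      # keep roughly 5% of the weights
w[np.abs(w) < threshold] = 0.0

sp = csr_matrix(w)
value_bytes = sp.data.nbytes                            # surviving weight values
index_bytes = sp.indices.nbytes + sp.indptr.nbytes      # column indices + row pointers
print(f"values: {value_bytes} B, indices: {index_bytes} B "
      f"({index_bytes / value_bytes:.2f}x overhead)")
```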

Layerwise sparse coding for pruned deep neural networks with extreme compression ratio

X Liu, W Li, J Huo, L Yao, Y Gao - Proceedings of the AAAI Conference on …, 2020 - aaai.org
Deep neural network compression is important and increasingly studied, especially in
resource-constrained environments such as autonomous drones and wearable devices …